Abstract

The accelerating expansion of the universe is the most surprising cosmological discovery in many 
decades, implying that the universe is dominated by some form of "dark energy" with exotic physical 
properties, or that Einstein's theory of gravity breaks down on cosmological scales. The profound 
implications of cosmic acceleration have inspired ambitious efforts to understand its origin, with 
experiments that aim to measure the history of expansion and growth of structure with percent-level 
precision or higher. We review in detail the four best-established methods for making such 
measurements: Type Ia supernovae, baryon acoustic oscillations (BAO), weak gravitational lensing, 
and the abundance of galaxy clusters. We pay particular attention to the systematic uncertainties 
in these techniques and to strategies for controlling them at the level needed to exploit "Stage IV" 
dark energy facilities such as BigBOSS, LSST, Euclid, and WFIRST. We briefly review a number 
of other approaches including redshift-space distortions, the Alcock-Paczynski effect, and direct 
measurements of the Hubble constant $H_0$. We present extensive forecasts for constraints on the 
dark energy equation of state and parameterized deviations from General Relativity, achievable with 
Stage III and Stage IV experimental programs that incorporate supernovae, BAO, weak lensing, 
and cosmic microwave background data. We also show the level of precision required for clusters 
or other methods to provide constraints competitive with those of these fiducial programs. We 
emphasize the value of a balanced program that employs several of the most powerful methods in 
combination, both to cross-check systematic uncertainties and to take advantage of complementary 
information. Surveys to probe cosmic acceleration produce data sets that support a wide range of 
scientific investigations, and they continue the longstanding astronomical tradition of mapping the 
universe in ever greater detail over ever larger scales. 
1. Introduction

Gravity pulls. Newton's Principia generalized this longstanding fact of human experience into a 
universal attractive force, providing compelling explanations of an extraordinary range of terrestrial 
and celestial phenomena. Newtonian attraction weakens with distance, but it never vanishes, and 
it never changes sign. Einstein's theory of General Relativity (GR) reproduces Newtonian gravity 
in the limit of weak spacetime curvature and low velocities. For a homogeneous universe filled with 
matter or radiation, GR predicts that the cosmic expansion will slow down over time, in accord with 
Newtonian intuition. In the late 1990s, however, two independent studies of distant supernovae 
found that the expansion of the universe has accelerated over the last five billion years (Riess et al. 
1998; Perlmutter et al. 1999), a remarkable discovery that is now buttressed by multiple lines of 
independent evidence. On the scale of the cosmos, gravity repels. 

Cosmic acceleration is the most profound puzzle in contemporary physics. Even the least exotic 
explanations demand the existence of a pervasive new component of the universe with unusual 
physical properties that lead to repulsive gravity. Alternatively, acceleration could be a sign that 
GR itself breaks down on cosmological scales. Cosmic acceleration may be the crucial empirical clue 
that leads to understanding the interaction between gravity and the quantum vacuum, or reveals 
the existence of extra spatial dimensions, or sheds light on the nature of quantum gravity itself. 

Because of these profound implications, cosmic acceleration has inspired a wide range of am- 
bitious experimental efforts, which aim to measure the expansion history and growth of structure 
in the cosmos with percent-level precision or better. In this article, we review the observational 
methods that underlie these efforts, with particular attention to techniques that are likely to see 
major advances over the next decade. We will emphasize the value of a balanced program that 
pursues several of these methods in combination, both to cross-check systematic uncertainties and 
to take advantage of complementary information. 

The remainder of this introduction briefly recaps the history of cosmic acceleration and current 
theories for its origin, then sets this article in the context of future experimental efforts and other 
reviews of the field. Section 2 describes the basic observables that can be used to probe cosmic 
acceleration, relates them to the underlying equations that govern the expansion history and the 
growth of structure, and introduces some of the parameters commonly used to define "generic" 
cosmic acceleration models. It concludes with an overview of the leading methods for measuring 
these observables. In §3-6 we go through the four best-developed methods in detail: Type Ia 
supernovae, baryon acoustic oscillations (BAO), weak gravitational lensing, and clusters of galaxies. 
Section 7 summarizes several other potential probes, whose prospects are currently more difficult 
to assess but in some cases appear quite promising. Informed by the discussions in these sections, 
§8 presents our principal new results: forecasts of the constraints on cosmic acceleration models 
that could be achieved by combining results from these methods, based on ambitious but feasible 
experiments like the ones endorsed by the Astro2010 Decadal Survey report, New Worlds, New 
Horizons in Astronomy and Astrophysics. We summarize the implications of our analyses in §9. 

1.1. History 

Just two years after the completion of General Relativity, Einstein (1917) introduced the first 
modern cosmological model. With little observational guidance, Einstein assumed (correctly) that 
the universe is homogeneous on large scales, and he proposed a matter-filled space with finite, 
positively curved, 3-sphere geometry. He also assumed (incorrectly) that the universe is static. Finding 
these two assumptions to be incompatible, Einstein modified the GR field equation to include the 
infamous "cosmological term," now usually known as the "cosmological constant" and denoted $\Lambda$. 
In effect, he added a new component whose repulsive gravity could balance the attractive gravity of 
the matter (though he did not describe his modification in these terms). In the 1920s, Friedmann 
(1922) and Lemaître (1927) introduced GR-based cosmological models with an expanding or 
contracting universe, some of them including a cosmological constant, others not. In 1929, Hubble 
discovered direct evidence for the expansion of the universe (Hubble 1929), thus removing the 
original motivation for the $\Lambda$ term.¹ In 1965, the discovery and interpretation of the cosmic microwave 
background (CMB; Penzias and Wilson 1965; Dicke et al. 1965) provided the pivotal evidence for 
a hot big bang origin of the cosmos. 

From the 1930s through the 1980s, a cosmological constant seemed unnecessary to explaining 
cosmological observations. The "cosmological constant problem" as it was defined in the 1980s 
was a theoretical one: why was the gravitational impact of the quantum vacuum vanishingly small 
compared to the "naturally" expected value (see §1.2)? In the late 1980s and early 1990s, however, 
a variety of indirect evidence began to accumulate in favor of a cosmological constant. Studies of 
large scale galaxy clustering, interpreted in the framework of cold dark matter (CDM) models with 
inflationary initial conditions, implied a low matter density parameter $\Omega_m = \rho_m/\rho_{\rm crit} \approx 0.15 - 0.4$ 
(e.g., Maddox et al. 1990; Efstathiou et al. 1990), in agreement with direct dynamical estimates 
that assumed galaxies to be fair tracers of the mass distribution. Reconciling this result with 
the standard inflationary cosmology prediction of a spatially flat universe (Guth 1981) required 
a new energy component with density parameter $1 - \Omega_m$. Open-universe inflation models were 
also considered, but explaining the observed homogeneity of the CMB (Smoot et al. 1992) in such 
models required speculative appeals to quantum gravity effects (e.g., Bucher et al. 1995) rather 
than the semi-classical explanation of traditional inflation. 

By the mid-1990s, many cosmological simulation studies included both open-CDM models 
and $\Lambda$CDM models, along with $\Omega_m = 1$ models incorporating tilted inflationary spectra, 
non-standard radiation components, or massive neutrino components (e.g., Ostriker and Cen 1996; 
Cole et al. 1997; Gross et al. 1998; Jenkins et al. 1998). Once normalized to the observed level 
of CMB anisotropies, the large-scale structure predictions of open and flat-$\Lambda$ models differed at 
the tens-of-percent level, with flat models generally yielding a more natural fit to the 
observations (e.g., Cole et al. 1997). Conflict between high values of the Hubble constant and the ages 
of globular clusters also favored a cosmological constant (e.g., Pierce et al. 1994; Freedman et al. 
1993; Chaboyer et al. 1996), though the frequency of gravitational lenses pointed in the opposite 
direction (Kochanek 1996). Thus, the combination of CMB data, large-scale structure data, age 
of the universe, and inflationary theory led many cosmologists to consider models with a 
cosmological constant, and some to declare it as the preferred solution (e.g., Efstathiou et al. 1990; 
Krauss and Turner 1995; Ostriker and Steinhardt 1995). 

Enter the supernovae. In the mid-1990s, two teams set out to measure the cosmic deceleration 
rate, and thereby determine the matter density parameter $\Omega_m$, by discovering and monitoring 
high-redshift Type Ia supernovae. The recognition that the peak luminosity of supernovae was tightly 
correlated with the shape of the light curve (Phillips 1993; Riess et al. 1996) played a critical 
role in this strategy, reducing the intrinsic distance error per supernova to ~ 10%. While the first 
analysis of a small sample indicated deceleration (Perlmutter et al. 1997), by 1998 the two teams 
had converged on a remarkable result: when compared to local Type Ia supernovae, the supernovae 
at z ~ 0.5 were fainter than expected in a matter-dominated universe with $\Omega_m \approx 0.2$ by about 0.2 
mag, or 20% (Riess et al. 1998; Perlmutter et al. 1999). Even an empty, freely expanding universe 
was inconsistent with the observations. Both teams interpreted their measurements as evidence for 



¹Several recent papers have addressed the contribution of Lemaître to this discovery; the story is interestingly 
tangled (Block 2011; van den Bergh 2011). 
an accelerating universe with a cosmological constant, consistent with a flat universe ($\Omega_{\rm tot} = 1$) 
having $\Omega_\Lambda \approx 0.7$. 

Why was the supernova evidence for cosmic acceleration accepted so quickly by the community 
at large? First, the internal checks carried out by the two teams, and the agreement of their con- 
clusions despite independent observations and many differences of methodology, seemed to rule out 
many forms of observational systematics, even relatively subtle effects of photometric calibration 
or selection bias. Second, the ground had been well prepared by the CMB and large scale struc- 
ture data, which already provided substantial indirect evidence for a cosmological constant. This 
confluence of arguments favored the cosmological interpretation of the results over astrophysical 
explanations such as evolution of the supernova population or grey dust extinction that increased 
towards higher redshifts. Third, the supernova results were followed within a year by the results of 
balloon-borne CMB experiments that mapped the first acoustic peak and measured its angular 
location, providing strong evidence for spatial flatness (de Bernardis et al. 2000; Hanany et al. 2000; 
see Netterfield et al. 1997 for earlier ground-based measurements hinting at the same result). On its 
own, the acoustic peak only implied $\Omega_{\rm tot} \approx 1$, not a non-zero $\Omega_\Lambda$, but it dovetailed perfectly with the 
estimates of $\Omega_m$ and $\Omega_\Lambda$ from large scale structure and supernovae. Furthermore, the acoustic peak 
measurement implied that the alternative to $\Lambda$ was not an open universe but a strongly decelerating, 
$\Omega_m = 1$ universe that disagreed with the supernova data by 0.5 magnitudes, a level much harder to 
explain with observational or astrophysical effects. Finally, the combination of spatial flatness and 
improving measurements of the Hubble constant (e.g., $H_0 = 71 \pm 6$ km s$^{-1}$ Mpc$^{-1}$; Mould et al. 
2003) provided an entirely independent argument for an energetically dominant accelerating 
component: a matter-dominated universe with $\Omega_{\rm tot} = 1$ would have age $t_0 = (2/3)H_0^{-1} \approx 9.5$ Gyr, too 
young to accommodate the 12-14 Gyr ages estimated for globular clusters (e.g., Chaboyer 1998). 
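
The age argument can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis: it evaluates $t_0 = (2/3)H_0^{-1}$ for a flat, matter-dominated universe and, for comparison, the standard closed-form age of a flat universe containing matter plus a cosmological constant. The function names, the unit-conversion constants, and the choice $\Omega_m = 0.3$ in the usage note are assumptions made for this example.

```python
import math

KM_PER_MPC = 3.0857e19      # kilometres in one megaparsec
SECONDS_PER_GYR = 3.156e16  # seconds in 10^9 years

def hubble_time_gyr(h0):
    """Hubble time 1/H0 in Gyr, for H0 given in km/s/Mpc."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR

def age_matter_only_gyr(h0):
    """Age of a flat, matter-dominated universe: t0 = (2/3) / H0."""
    return (2.0 / 3.0) * hubble_time_gyr(h0)

def age_flat_lambda_gyr(h0, omega_m):
    """Age of a flat universe with matter and a cosmological constant.

    Uses t0 = (2/3) / H0 * asinh(sqrt(OL/Om)) / sqrt(OL), with OL = 1 - Om,
    and neglects radiation (an assumption that is adequate at late times).
    """
    omega_l = 1.0 - omega_m
    factor = math.asinh(math.sqrt(omega_l / omega_m)) / math.sqrt(omega_l)
    return (2.0 / 3.0) * hubble_time_gyr(h0) * factor
```

For $H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$, `age_matter_only_gyr(71.0)` gives roughly 9 Gyr, below the globular cluster ages, while `age_flat_lambda_gyr(71.0, 0.3)` gives roughly 13 Gyr.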



A decade later, the web of observational evidence for cosmic acceleration is intricate and robust. 
A wide range of observations, including larger and better calibrated supernova samples over a wider 
redshift range, high-precision CMB data down to small angular scales, the baryon acoustic scale in 
galaxy clustering, weak lensing measurements of dark matter clustering, the abundance of massive 
clusters in X-ray and optical surveys, the level of structure in the Lyα forest, and precise 
measurements of $H_0$, are all consistent with an inflationary cold dark matter model with a cosmological 
constant, commonly abbreviated as $\Lambda$CDM.² Explaining all of these data simultaneously requires 
an accelerating universe. Completely eliminating any one class of constraints (e.g., supernovae, or 
CMB, or $H_0$) would not change this conclusion, nor would doubling the estimated systematic errors 
on all of them. The question is no longer whether the universe is accelerating, but why. 

1.2. Theories of Cosmic Acceleration 

A cosmological constant is the mathematically simplest solution to the cosmic acceleration 
puzzle. While Einstein introduced his cosmological term as a modification to the curvature side of 
the field equation, it is now more common to interpret $\Lambda$ as a new energy component, constant in 
space and time. For an ideal fluid with energy density $u$ and pressure $p$, the effective gravitational 
source term in GR is $(u + 3p)/c^2$, reducing to the usual mass density $\rho = u/c^2$ if the fluid is 
non-relativistic. For a component whose energy density remains constant as the universe expands, 
the first law of thermodynamics implies that when a comoving volume element in the universe 
expands by a (physical) amount $dV$, the corresponding change in energy is related to the pressure 
via $-p\,dV = dU = u\,dV$. Thus, $p = -u$, making the gravitational source term $(u+3p)/c^2 = -2u/c^2$. 
A form of energy that is constant in space and time must have a repulsive gravitational effect. 
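
As a small numerical illustration of this sign argument, the sketch below evaluates the source term $(u + 3p)/c^2$ for a fluid with $p = wu$. The function name and the use of units with $c = 1$ by default are assumptions for the example; the only physics used is the expression quoted in the text.

```python
def gravitational_source(u, w, c=1.0):
    """Effective gravitational source term (u + 3p)/c^2 for a fluid with p = w*u.

    u : energy density (arbitrary units)
    w : ratio of pressure to energy density
    c : speed of light in the chosen units (defaults to 1)
    """
    p = w * u
    return (u + 3.0 * p) / c**2

# Pressureless matter attracts; a constant energy density (p = -u) repels.
matter_source = gravitational_source(1.0, 0.0)    # equals u/c^2
vacuum_source = gravitational_source(1.0, -1.0)   # equals -2u/c^2
```

In this parameterization the source term changes sign at $w = -1/3$, so any component with $w < -1/3$ acts repulsively.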



²Many of the relevant observational references will appear in subsequent sections on specific topics. 
According to quantum field theory, "empty" space is filled with a sea of virtual particles. It 
would be reasonable to interpret the cosmological constant as the gravitational signature of this 
quantum vacuum energy, much as the Lamb shift is a signature of its electromagnetic effects.³ 
The problem is one of magnitude. Since virtual particles of any allowable mass can come into 
existence for short periods of time, the "natural" value for the quantum vacuum density is one 
Planck Mass per cubic Planck Length. This density is about 120 orders of magnitude larger than 
the cosmological constant suggested by observations: it would drive accelerated expansion with a 
timescale of $t_{\rm Planck} \sim 10^{-43}$ sec instead of $t_{\rm Hubble} \sim 10^{18}$ sec. Since the only "natural" number close 
to $10^{-120}$ is zero, it was generally assumed (prior to 1990) that a correct calculation of the quantum 
vacuum energy would eventually show it to be zero, or at least suppressed by an extremely large 
exponential factor (see review by Weinberg 1989). But the discovery of cosmic acceleration raises 
the possibility that the quantum vacuum really does act as a cosmological constant, and that its 
energy scale is $10^{-3}$ eV rather than $10^{28}$ eV for reasons that we do not yet understand. To date, 
there are no compelling theoretical arguments that explain either why the fundamental quantum 
vacuum energy might have this magnitude or why it might be zero. 
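
The size of this mismatch can be estimated directly. The following sketch, offered as a rough order-of-magnitude check and not as a calculation from the original text, compares one Planck mass per cubic Planck length, $c^5/(\hbar G^2)$, with the vacuum density implied by a flat universe. The constant values are standard SI figures; the function names and the inputs $H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_\Lambda = 0.7$ are assumptions chosen to match numbers quoted in this section.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s
METERS_PER_MPC = 3.0857e22

def planck_density():
    """One Planck mass per cubic Planck length, in kg/m^3: c^5 / (hbar G^2)."""
    return C**5 / (HBAR * G**2)

def vacuum_density(h0_km_s_mpc=71.0, omega_lambda=0.7):
    """Vacuum mass density Omega_Lambda * 3 H0^2 / (8 pi G), in kg/m^3."""
    h0_si = h0_km_s_mpc * 1.0e3 / METERS_PER_MPC   # H0 in s^-1
    rho_crit = 3.0 * h0_si**2 / (8.0 * math.pi * G)
    return omega_lambda * rho_crit

def orders_of_magnitude_gap():
    """log10 of the ratio of the Planck density to the observed vacuum density."""
    return math.log10(planck_density() / vacuum_density())
```

With these inputs the gap comes out at a little over 120 orders of magnitude; the exact figure depends on conventions (for example factors of $2\pi$ in the cutoff), which is why the text quotes it only approximately.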

The other basic puzzle concerning a cosmological constant is: Why now? The ratio of a constant 
vacuum energy density to the matter density scales as $a^3(t)$, so it has changed by a factor of $\sim 10^{27}$ 
since big bang nucleosynthesis and by a factor $\sim 10^{42}$ since the electroweak symmetry breaking 
epoch, which seems (based on our current understanding of physics) like the last opportunity for 
a major rebalancing of matter and energy components. It therefore seems remarkably coincidental 
for the vacuum energy density and the matter energy density to have the same order of magnitude 
today. In the late 1970s, Robert Dicke used a similar line of reasoning to argue for a spatially flat 
universe (see Dicke and Peebles 1979), an argument that provided much of the initial motivation 
for inflationary theory (Guth 1981). However, while the universe appears to be impressively close 
to spatial flatness, the existence of two energy components with different $a(t)$ scalings means that 
Dicke's "coincidence problem" is still with us. 
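
To make the scaling concrete, the sketch below evaluates the vacuum-to-matter density ratio as a function of the expansion factor, assuming a constant vacuum density and matter diluting as $a^{-3}$. The function names and the present-day values $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$ are illustrative assumptions, as is the use of $a = 10^{-9}$ as a stand-in for an early epoch.

```python
def vacuum_to_matter_ratio(a, omega_m=0.3, omega_lambda=0.7):
    """Ratio of vacuum to matter energy density at expansion factor a (a = 1 today).

    Matter density scales as a^-3 and the vacuum density is constant,
    so the ratio grows as a^3.
    """
    return (omega_lambda / omega_m) * a**3

def equality_redshift(omega_m=0.3, omega_lambda=0.7):
    """Redshift at which the vacuum and matter densities are equal."""
    a_eq = (omega_m / omega_lambda) ** (1.0 / 3.0)
    return 1.0 / a_eq - 1.0

# The ratio at a = 1e-9 is smaller than today's by a factor of 1e-27.
early_over_today = vacuum_to_matter_ratio(1.0e-9) / vacuum_to_matter_ratio(1.0)
```

With the assumed parameters the two densities are equal at a redshift of about 0.3, which is the sense in which the near-equality today looks coincidental given the enormous range the ratio has covered.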

One possible solution to the coincidence problem is anthropic: if the vacuum energy assumes 
widely different values in different regions of the universe, then conscious observers will find 
themselves in regions of the universe where the vacuum energy is low enough to allow structure formation 
(Efstathiou 1995; Martel et al. 1998). This type of explanation finds a natural home in 
"multiverse" models of eternal inflation, where different histories of spontaneous symmetry breaking lead 
to different values of physical constants in each non-inflating "bubble" (Linde 1987), and it has 
gained new prominence in the context of string theory, which predicts a "landscape" of vacua 
arising from different compactifications of spatial dimensions (Susskind 2003). One can attempt 
to derive an expectation value of the observed cosmological constant from such arguments (e.g., 
Martel et al. 1998), but the results are sensitive to the choice of parameters that are allowed to 
vary (Tegmark and Rees 1998) and to the choice of measure on parameter space, so it is hard to 
take such "predictions" beyond a qualitative level. A variant on these ideas is that the effective 
value (and perhaps even the sign) of the cosmological constant varies in time, and that structure 
will form and observers arise during periods when its magnitude is anomalously low compared to 
its natural (presumably Planck-level) energy scale (Brandenberger 2002; Griest 2002). 

A straightforward alternative to a cosmological constant is a field with negative pressure (and 
thus repulsive gravitational effect) whose energy density changes with time (Ratra and Peebles 
1988; Frieman et al. 1995; Ferreira and Joyce 1997). In particular, a canonical scalar field $\phi$ with 



³This interpretation of the cosmological constant in the context of quantum field theory was revived in the late 
1960s by Zel'dovich (1968); for further discussion of the history see Peebles and Ratra (2003). 
potential $V(\phi)$ has energy density and pressure 

$$u_\phi = \frac{1}{2}\frac{1}{\hbar c^3}\dot{\phi}^2 + V(\phi), \qquad p_\phi = \frac{1}{2}\frac{1}{\hbar c^3}\dot{\phi}^2 - V(\phi), \qquad (1)$$



so if the kinetic term is subdominant, then $p_\phi \approx -u_\phi$. A slowly rolling scalar field of this sort is 
analogous to the inflaton field hypothesized to drive inflation, but at an energy scale many, many 
orders of magnitude lower. In general, a scalar field has an equation-of-state parameter 



$$w = \frac{p}{u} \qquad (2)$$



that is greater than $-1$ and varies in time, while a true cosmological constant has $w = -1$ at all 
times. Some forms of $V(\phi)$ allow attractor or "tracker" solutions in which the late-time evolution 
of $\phi$ is insensitive to the initial conditions (Ratra and Peebles 1988; Steinhardt et al. 1999), and 
a subset of these allow $u_\phi$ to track the matter energy density at early times, ameliorating the 
coincidence problem (Skordis and Albrecht 2002). Some choices give a nearly constant $w$ that is 
different from $-1$, while others have $w \approx -1$ as an asymptotic state at either early or late times, 
referred to respectively as "thawing" or "freezing" solutions (Caldwell and Linder 2005). 
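
A short sketch may help show how the equation-of-state parameter depends on the balance between kinetic and potential terms. It assumes the standard canonical form in which the energy density is the sum of a kinetic term $K$ and the potential $V$, and the pressure is their difference; the function name and the sample values are hypothetical and chosen only for illustration.

```python
def scalar_field_w(kinetic, potential):
    """Equation-of-state parameter w = p/u for a canonical scalar field.

    kinetic   : kinetic energy density term K (non-negative)
    potential : potential energy density V (non-negative)
    Assumes u = K + V and p = K - V.
    """
    u = kinetic + potential
    p = kinetic - potential
    return p / u

# Slow roll (K much smaller than V) approaches a cosmological constant, w -> -1.
w_slow_roll = scalar_field_w(0.01, 1.0)
# A kinetic-dominated field approaches w -> +1.
w_kinetic = scalar_field_w(1.0, 0.01)
```

For non-negative $K$ and $V$ the result always lies between $-1$ and $+1$, which is the sense in which a canonical field has $w$ greater than $-1$ whenever it is rolling at all.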

Scalar field models in which the energy density is dominated by $V(\phi)$ are popularly known 
as "quintessence" (Zlatev et al. 1999). A number of variations have been proposed in which the 
energy density of the field is dominated by kinetic, spin, or oscillatory degrees of freedom (e.g., 
Armendariz-Picon et al. 2001; Boyle et al. 2002). Other models introduce non-canonical kinetic 
terms or couple the field to dark matter. Models differ in the evolution of $u_\phi(a)$ and $w(a)$, and 
some have other distinctive features such as large scale energy density fluctuations that can affect 
CMB anisotropies. Of course, none of these models addresses the original "cosmological constant 
problem" of why the true vacuum energy is unobservably small. 

The alternative to introducing a new energy component is to modify General Relativity itself 
on cosmological scales, for example by replacing the Ricci scalar $R$ in the gravitational action with 
some higher order function $f(R)$ (e.g., Capozziello and Fang 2002; Carroll 2003), or by allowing 
gravity to "leak" into an extra dimension in a way that reduces its attractive effect at large scales 
(Dvali et al. 2000). GR modifications can alter the relation between the expansion history and 
the growth of matter clustering, and, as discussed in subsequent sections, searching for mismatches 
between observational probes of expansion and observational probes of structure growth is one 
generic approach to seeking signatures of modified gravity. To be consistent with tight constraints 
from solar system tests, modifications of gravity must generally be "shielded" on small scales, by 
mechanisms such as the "chameleon" effect, the "symmetron" mechanism, or "Vainshtein screening" 
(see the review by Jain and Khoury 2010). These mechanisms can have the effect of introducing 
intermediate scale forces. GR modifications can also alter the relation between non-relativistic 
matter clustering and gravitational lensing, which in standard GR is controlled by two different 
potentials that are equal to each other for fluids without anisotropic stress. 

The distinction between a new energy component and a modification of gravity may be am- 
biguous. The most obvious ambiguous case is the cosmological constant itself, which can be placed 
on either the "curvature" side or the "stress-energy" side of the Einstein field equation. More 
generally, many theories with $f(R)$ modifications of the gravitational action can be written in a 
mathematically equivalent form of GR plus a scalar field with specified properties (Chiba 2003; 
Kunz and Sapone 2007). Relative to expectations for a cosmological constant or a simple scalar 
field model, models in which dark matter decays into dark energy can produce a mismatch 
between the histories of expansion and structure growth while maintaining GR (e.g., Jain and Zhang 
2008; Wei and Zhang 2008). Thus, even perfect measurements of all relevant observables may not 
uniquely locate the explanation of cosmic acceleration in the gravitational or stress-energy sector. 

While the term "dark energy" seems to presuppose a stress-energy explanation, in practice it 
has become a generic term for referring to the cosmic acceleration phenomenon. In particular, 
the phrase "dark energy experiments" has come to mean observational studies aimed at measuring 
acceleration and uncovering its cause, regardless of whether that cause is a new energy field or a 
modification of gravity. We will generally adopt this common usage of "dark energy" in this review, 
though where the distinction matters we will try to use "cosmic acceleration" as our generic term. It 
is important to keep in mind that we presently have strong observational evidence for accelerated 
cosmic expansion but no compelling evidence that the cause of this acceleration is really a new 
energy component. 

The magnitude and coincidence problems are challenges for any explanation of cosmic accel- 
eration, whether a cosmological constant, a scalar field, or a modification of GR. The coincidence 
problem seems like an important clue for identifying a correct solution, and some models at least 
reduce its severity by coupling the matter and dark energy densities in some way. Multiverse 
models with anthropic selection arguably offer a solution to the coincidence problem, because if the 
probability distribution of vacuum energy densities rises swiftly towards high values, then structure 
may generically form at a time when the matter and vacuum energy density values are similar, in 
that small subset of universes where structure forms at all. But sometimes a coincidence is just 
a coincidence. Essentially all current theories of cosmic acceleration have one or more adjustable 
parameters whose value is tuned to give the observed level of acceleration, and none of them yield 
this level as a "natural" expectation unless they have built it in ahead of time. These theories 
are designed to explain acceleration itself rather than emerging from independent theoretical con- 
siderations or experimental constraints. Conversely, a theory that provided a compelling account 
of the observed magnitude of acceleration — like GR's successful explanation of the precession of 
Mercury — would quickly jump to the top of the list of cosmic acceleration models. 



1.3. Looking Forward 

The deep mystery and fundamental implications of cosmic acceleration have inspired numerous 
ambitious observational efforts to measure its history and, it is hoped, reveal its origin. The report of 
the Dark Energy Task Force (DETF; Albrecht et al. 2006) played a critical role in systematizing the 
field, by categorizing experimental approaches and providing a quantitative framework to compare 
their capabilities. The DETF categorized then-ongoing experiments as "Stage II" (following the 
"Stage I" discovery experiments) and the next generation as "Stage III." It looked forward to a 
generation of more capable (and more expensive) "Stage IV" efforts that might begin observations 
around the second half of the coming decade. The DETF focused on the same four methods that 
will be the primary focus of this review: Type Ia supernovae, baryon acoustic oscillations (BAO), 
weak gravitational lensing, and clusters of galaxies. 

Four years on, the main "Stage II" experiments have completed their observations though 
not necessarily their final analyses. Prominent examples include the supernova and weak lensing 
programs of the CFHT Legacy Survey (CFHTLS; Semboloni et al. 2006; Conley et al. 2011), the 
ESSENCE supernova survey (Wood-Vasey et al. 2007), BAO measurements from the Sloan Digital 
Sky Survey (SDSS; Eisenstein et al. 2005; Percival et al. 2010), and the SDSS-II supernova survey 
(Frieman et al. 2008b). These have been complemented by extensive multi-wavelength studies of 
local and high-redshift supernovae such as the Carnegie Supernova Project (Hamuy et al. 2006; 
Freedman et al. 2009), by systematic searches for z > 1 supernovae with Hubble Space Telescope 
(Riess et al. 2007), by dark energy constraints from the evolution of X-ray or optically selected 
clusters (Henry et al. 2009; Mantz et al. 2010; Vikhlinin et al. 2009; Rozo et al. 2010), by 
improved measurements of the Hubble constant (Riess et al. 2009, 2011), and by CMB data from 
the WMAP satellite (Bennett et al. 2003) and from ground-based experiments that probe smaller 
angular scales.⁴ Most data remain consistent with a spatially flat universe and a cosmological 
constant with $\Omega_\Lambda = 1 - \Omega_m \approx 0.75$, with an uncertainty in the equation-of-state parameter $w$ that 
is roughly ±0.1 at the $1-2\sigma$ level. Substantial further improvement will in many cases require 
reduction in systematic errors as well as increased statistical power from larger data sets. 

The clearest examples of "Stage III" experiments, now in the late construction or early 
operations phase, are the Dark Energy Survey (DES), Pan-STARRS1,⁵ the Baryon Oscillation 
Spectroscopic Survey (BOSS) of SDSS-III, and the Hobby-Eberly Telescope Dark Energy Experiment 
(HETDEX).⁶ All four projects are being carried out by international, multi-institutional 
collaborations. Pan-STARRS and DES will both carry out large area, multi-band imaging surveys that 
go a factor of ten or more deeper (in flux) than the SDSS imaging survey (Abazajian et al. 2009), 
using, respectively, a 1.4-Gigapixel camera on the 1.8-m PS1 telescope on Haleakala in Hawaii and a 
0.5-Gigapixel camera on the 4-m Blanco telescope on Cerro Tololo in Chile. These imaging surveys 
will be used to measure structure growth via weak lensing, to identify galaxy clusters and calibrate 
their masses via weak lensing, and to measure BAO in galaxy angular clustering using photometric 
redshifts. Each project also plans to carry out monitoring surveys over smaller areas to discover 
and measure thousands of Type Ia supernovae. Fully exploiting BAO requires spectroscopic 
redshifts, and BOSS will carry out a nearly cosmic-variance limited survey (over $10^4$ deg$^2$) out to 
z ~ 0.7 using a 1000-fiber spectrograph to measure redshifts of 1.5 million luminous galaxies, and a 
pioneering quasar survey that will measure BAO at z ~ 2.5 by using the Lyα forest along 150,000 
quasar sightlines to trace the underlying matter distribution. HETDEX plans a BAO survey of $10^6$ 
Lyα-emitting galaxies at z ≈ 3. 

There are many other ambitious observational efforts that do not fit so neatly into the definition 
of a "Stage III dark energy experiment" but will nonetheless play an important role in "Stage III" 
constraints. A predecessor to BOSS, the WiggleZ project on the Anglo-Australian 3.9-m telescope, 
has recently completed a spectroscopic survey of 240,000 emission line galaxies out to z = 1.0 
(Blake et al. 2011). The Hyper Suprime-Cam (HSC) facility on the Subaru telescope will have 
wide-area imaging capabilities comparable to DES and Pan-STARRS, and it is likely to devote 
substantial fractions of its time to weak lensing surveys. Other examples include intensive 
spectroscopic and photometric monitoring of supernova samples aimed at calibration and understanding 
of systematics, new HST searches for z > 1 supernovae, further improvements in $H_0$ determination, 
deeper X-ray and weak lensing studies of samples of tens or hundreds of galaxy clusters, and new 
cluster searches via the Sunyaev-Zel'dovich (1970) effect using the South Pole Telescope (SPT), the 
Atacama Cosmology Telescope (ACT), or the Planck satellite. In addition, Stage III analyses will 
draw on primary CMB constraints from Planck. 

The Astro2010 report identifies cosmic acceleration as one of the most pressing questions in 
contemporary astrophysics, and its highest priority recommendations for new ground-based and 



^We follow the convention in the astronomical literature of italicizing the names and acronyms of space missions 
but not of ground-based facilities. For reference, note that the many acronyms that appear in the article are all 
defined in Appendix A, the glossary of acronyms and facilities. 

^Pan-STARRS, the Panoramic Survey Telescope and Rapid Response System, is the acronym of the facility rather 
than the project, but cosmological surveys are among its major goals. Pan-STARRS eventually hopes to use four 
coordinated telescopes, but the surveys currently underway use the first of these telescopes, referred to as PS1.

^The acronym and facilities glossary gives references to web sites and/or publications that describe these and
other experiments. 
space-based facilities both have cosmic acceleration as a primary science theme. On the ground,
the Large Synoptic Survey Telescope (LSST), a wide-field 8.4-m optical telescope equipped with
a 3.2-Gigapixel camera, would enable deep weak lensing and optical cluster surveys over much of
the sky, synoptic surveys that would detect and measure tens of thousands of supernovae, and
photometric-redshift BAO surveys extending to $z \approx 3.5$. BigBOSS, highlighted as an initiative
that could be supported by the proposed "mid-scale innovation program," would use a highly
multiplexed fiber spectrograph on the NOAO 4-m telescopes to carry out spectroscopic surveys
of $\sim 10^7$ galaxies to $z \approx 1.6$ and Lyα forest BAO measurements at $z > 2.2$. Another potential
ground-based method for large volume BAO surveys is radio "intensity mapping," which seeks
to trace the large scale distribution of neutral hydrogen without resolving the scale of individual
galaxies. In the longer run, the Square Kilometer Array (SKA) could enable a BAO survey of
$\sim 10^9$ HI-selected galaxies and weak lensing measurements of $\sim 10^{10}$ star-forming galaxies using
radio continuum shapes.

Space observations offer two critical advantages for cosmic acceleration studies: stable high
resolution imaging over large areas, and vastly higher sensitivity at near-IR wavelengths. (For
cluster studies, space observations are also the only route to X-ray measurements.) These advantages
inspired the Supernova Acceleration Probe (SNAP), initially designed with a concentration
on supernova measurements at $0.1 < z < 1.7$, and later expanded to include a wide area weak
lensing survey as a major component. Following the National Research Council's Quarks to Cosmos
report (Committee on the Physics of the Universe, 2003), NASA and the U.S. Department
of Energy embarked on plans for a Joint Dark Energy Mission (JDEM), which has considered a
variety of mission architectures for space-based supernova, weak lensing, and BAO surveys. The
Astro2010 report endorsed as its highest priority space mission a Wide-Field Infrared Space Telescope
(WFIRST), which would carry out imaging and dispersive prism spectroscopy in the near-IR
to support all three methods. The suggested design of WFIRST, a 1.5-m telescope with a large
near-IR focal plane array, is like that of the JDEM-Omega proposal (Gehrels, 2010), but the endorsed
mission scope is considerably broader, including a planetary microlensing program and a
guest observer program. An updated technical and scientific plan for WFIRST appears in the
preliminary report from the WFIRST Science Definition Team (Green et al., 2011). The mission
still faces substantial funding hurdles, so despite its top billing in Astro2010, its future is uncertain.
On the European side, ESA recently selected the Euclid* satellite as a medium-class mission for
its Cosmic Vision 2015-2025 program, with launch planned for 2019 (Laureijs et al., 2011). Euclid
plans to carry out optical and near-IR imaging and near-IR slitless spectroscopy over half the sky,
for weak lensing and BAO measurements. Well ahead of either mission, the European X-ray telescope
eROSITA (on the Russian Spectrum Roentgen Gamma satellite) is expected to produce an
all-sky catalog of $\sim 10^5$ X-ray selected clusters, with X-ray temperature measurements and resolved
profiles for the brighter clusters.

The completion of the Astro2010 Decadal Survey and the Euclid selection by ESA make this 
an opportune time to review the techniques and prospects for probing cosmic acceleration with 



^We will use the term "Astro2010 report" to refer collectively to New Worlds, New Horizons and to the panel 
reports that supported it. In particular, detailed discussion of these science themes and related facilities can be 
found in the individual reports of the Cosmology and Fundamental Physics (CFP) Science Frontiers Panel and the 
Electromagnetic Observations from Space (EOS), Optical and Infrared Astronomy from the Ground (OIR), and Radio, 
Millimeter, and Sub-Millimeter Astronomy from the Ground (RMS) Program Prioritization Panels. Information on 
all of these reports can be found at http://sites.nationalacademies.org/bpa/BPA_049810

*Not an acronym. 

^More detailed description of Euclid and WFIRST can be found in gS^l and of eROSITA in 363] 
ambitious observational programs. Our goal is, in some sense, an update of the DETF report
(Albrecht et al., 2006), incorporating the many developments in the field over the last few years
and (the difference between a report and a review) emphasizing explanation rather than recommendation.
We aim to complement other reviews of the field that differ in focus or in level of
detail. To mention just a selection of these, we note that Frieman et al. (2008a) and Blanchard
provide excellent overviews of the field, covering theory, current observations, and future
experiments; Peebles and Ratra (2003) and Copeland et al. (2006) are especially good on history
of the subject and on theoretical aspects of scalar field models; Jain and Khoury (2010) review the
observational and (especially) theoretical aspects of modified gravity models in much greater depth
than we cover here; Carroll (2003) and Linder (2003b, 2007) provide accessible and informative
introductions at the less forbidding length of conference proceedings; and Linder (2010) provides
a review aimed at a general scientific audience. The distinctive features of the present review are
our in-depth discussion of individual observational methods and our new quantitative forecasts for
how combinations of these methods can constrain parameters of cosmic acceleration theories.

To the extent that we have a consistent underlying theme, it is the importance of pursuing 
a balanced observational program. We do not believe that all methods or all implementations of 
methods are equal; some approaches have evident systematic limitations that will prevent them 
reaching the sub-percent accuracy level that is needed to make major contributions to the field 
over the next decade, while others would require prohibitively expensive investments to achieve the 
needed statistical precision. However, for a given level of community investment, we think there is 
more to be gained by doing a good job on the three or four most promising methods than by doing 
a perfect job on one at the expense of the others. A balanced approach offers crucial cross-checks 
against systematic errors, takes advantage of complementary information contained in different 
observables or complementary strengths in different redshift ranges, and holds the best chance 
of uncovering "surprises" that do not fit into the conventional categories of theoretical models. 
This philosophy will emerge most clearly in §8, where we present our quantitative forecasts. For
understandable reasons, most articles and proposals (including some we have written ourselves) 
start from current knowledge and show the impact of adding a particular new experiment. We will 
instead start from a "fiducial program" that assumes ambitious but achievable advances in several 
different methods at once, then consider the impact of strengthening, weakening, or omitting its 
individual elements. 

We expect that different readers will want to approach this lengthy article in different ways. 
For a reader who is new to the field and wants to learn it well, it makes sense to start at the 
beginning and read to the end. A reader interested in a specific method can skim §2 to get a sense
of our notation, then jump to the section that describes that method (Type Ia supernovae in §3,
BAO in §4, weak lensing in §5, and clusters in §6). We think that these sections will provide useful
insights even to experts in the field. Section 7 provides a brief overview of emerging methods that
could play an important role in future studies. Readers interested mainly in the ways that different
methods contribute to constraints on cosmic acceleration models and the quantitative forecasts for
Stage III and Stage IV programs can jump directly to §8. Finally, §9 provides a summary of our
findings and their implications for experimental programs.
2. Observables, Parameterizations, and Methods



The two top-level questions about cosmic acceleration are: 

1. Does acceleration arise from a breakdown of GR on cosmological scales or from a new energy 
component that exerts repulsive gravity within GR? 

2. If acceleration is caused by a new energy component, is its energy density constant in space 
and time? 

As already discussed in §1.2, the distinction between "modified gravity" and "new energy component"
solutions may not be unambiguous. However, the cosmological constant hypothesis makes
specific, testable predictions, and the combination of GR with relatively simple scalar field models 
predicts testable consistency relations between expansion and structure growth. 

The answer to these questions, or a major step towards an answer, could come from a surprising 
direction: a theoretical breakthrough, a revealing discovery in accelerator experiments, a time- 
variation of a fundamental "constant," or an experimental failure of GR on terrestrial or solar 
system scales (see §7.7 for brief discussion). However, "wait for a breakthrough" is an unsatisfying
recipe for scientific progress, and there is one clear path forward: measure the history of expansion 
and the growth of structure with increasing precision over an increasing range of redshift and 
lengthscale. 

2.1. Basic Equations 

In GR, the expansion of a homogeneous and isotropic universe is governed by the Friedmann 
equation, which can be written in the form 

$$\frac{H^2(z)}{H_0^2} = \Omega_m(1+z)^3 + \Omega_r(1+z)^4 + \Omega_k(1+z)^2 + \Omega_\phi\,\frac{u_\phi(z)}{u_\phi(z=0)} , \qquad (3)$$

where $(1+z) \equiv a^{-1}$ is the cosmological redshift and $a(t)$ is the expansion factor relating physical
separations to comoving separations. The Hubble parameter is $H(z) \equiv \dot{a}/a$, and $\Omega_m$, $\Omega_r$, and
$\Omega_\phi$ are the present day energy densities of matter, radiation, and a generic form of dark energy $\phi$.
These are expressed as ratios to the critical energy density required for flat space geometry,

$$\Omega_x = \frac{u_x}{\rho_{\rm crit} c^2} , \qquad \rho_{\rm crit} = \frac{3H_0^2}{8\pi G} . \qquad (4)$$

At higher redshifts

$$\Omega_m(z) \equiv \frac{\rho_m(z)}{\rho_{\rm crit}(z)} = \Omega_m (1+z)^3\, \frac{H_0^2}{H^2(z)} , \qquad (5)$$

where the second equality follows from the scaling $\rho_m(z) = \rho_{m,0} \times (1+z)^3$ and from the definition
of $\rho_{\rm crit}(z)$. In the formulation (3), the impact of curvature on expansion is expressed like that of a
"dynamical" component with scaled energy density

$$\Omega_k \equiv 1 - \Omega_m - \Omega_r - \Omega_\phi , \qquad (6)$$

with $\Omega_k = 0$ for a spatially flat universe. In a standard cold dark matter scenario, the matter density
is the sum of the densities of CDM, baryons, and non-relativistic neutrinos, $\Omega_m = \Omega_c + \Omega_b + \Omega_\nu$.



^We will refer to values of these parameters at $z \neq 0$ as $\Omega_m(z)$, $\Omega_\phi(z)$, etc. For other quantities (e.g., $H_0$), we use
subscripts to denote values at $z = 0$. When we assume a cosmological constant, we will replace $\Omega_\phi$ by $\Omega_\Lambda$.

In detail, one must beware that the neutrino energy density does not scale as $(1+z)^3$ at higher
redshifts, when they are mildly relativistic, and that the clustering of neutrinos on small scales is
suppressed by their residual thermal velocities.
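
As a purely illustrative aside, the relations in equations (3), (5), and (6) can be evaluated numerically with a few lines of Python. The sketch below is not taken from the source analysis: the function names, the default parameter values, and the choice of a constant equation of state `w` for the dark energy term (so that its density scales as $(1+z)^{3(1+w)}$, with $w = -1$ giving a cosmological constant) are all assumptions made for the example.

```python
import math

def hubble_ratio(z, omega_m=0.3, omega_r=0.0, omega_k=0.0, w=-1.0):
    """E(z) = H(z)/H0 from the Friedmann equation (eq. 3).

    The dark energy density parameter is fixed by the closure relation
    (eq. 6). A constant equation of state w is an assumption of this sketch.
    """
    omega_de = 1.0 - omega_m - omega_r - omega_k
    x = 1.0 + z
    e2 = (omega_m * x**3 + omega_r * x**4 + omega_k * x**2
          + omega_de * x**(3.0 * (1.0 + w)))
    return math.sqrt(e2)

def omega_m_of_z(z, omega_m=0.3, **kwargs):
    """Matter density relative to the critical density at redshift z (eq. 5)."""
    e = hubble_ratio(z, omega_m=omega_m, **kwargs)
    return omega_m * (1.0 + z)**3 / e**2

if __name__ == "__main__":
    # Illustrative parameter values only.
    for z in (0.0, 0.5, 1.0, 3.0, 9.0):
        print(z, hubble_ratio(z), omega_m_of_z(z))
```

With these illustrative defaults, `omega_m_of_z` approaches unity at high redshift, which is the sense in which dark energy becomes dynamically negligible at early times.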

There are some routes to direct measurement of $H(z)$, most notably via BAO (see §4). For
the most part, however, observations constrain $H(z)$ indirectly by measuring the distance-redshift
relation or the history of structure growth. Hogg (1999) provides a compact and pedagogical summary
of cosmological distance measures.



The comoving line-of-sight distance to an object at redshift $z$ is

$$D_C(z) = \frac{c}{H_0} \int_0^z dz'\, \frac{H_0}{H(z')} . \qquad (7)$$

Defining a dimensional (length$^{-2}$) curvature parameter

$$K = -\Omega_k (c/H_0)^{-2} \qquad (8)$$

allows us to write the comoving angular diameter distance, relating an object's comoving size $l$
to its angular size $\theta = l/D_A$, as

$$D_A(z) = K^{-1/2} \sin\left(K^{1/2} D_C\right) , \qquad (9)$$

which applies for either sign of $\Omega_k$. Noting that observations imply $|\Omega_k| \ll 1$, we can Taylor
expand equation (9) to write

$$D_A(z) \approx D_C \left[ 1 + \frac{1}{6}\,\Omega_k \left(\frac{D_C}{c/H_0}\right)^2 \right] , \qquad (10)$$

which also yields the correct result $D_A = D_C$ for $\Omega_k = 0$. Note that positive space curvature
($\Omega_{\rm tot} > 1$, $K > 0$) corresponds to negative $\Omega_k$, hence a smaller $D_A$ and larger angular size than
in a flat universe. If $u_\phi(z) > u_{\phi,0}$ then the Hubble parameter at $z > 0$ is higher compared to a
cosmological constant model with the same matter density and curvature (eq. [3]), and distances to
redshifts $z > 0$ are lower (eq. [9]).
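
The distance relations above lend themselves to a short numerical check. The following Python sketch, offered only as an illustration, integrates the comoving distance with Simpson's rule, applies the curvature correction of equation (9), and compares it with the small-curvature expansion of equation (10). The function names, the restriction to matter plus curvature plus a cosmological constant, and the default parameter values are assumptions of the example; distances are returned in units of $c/H_0$.

```python
import math

def e_of_z(z, omega_m, omega_k):
    """H(z)/H0 for matter, curvature, and a cosmological constant (eq. 3)."""
    omega_l = 1.0 - omega_m - omega_k
    x = 1.0 + z
    return math.sqrt(omega_m * x**3 + omega_k * x**2 + omega_l)

def comoving_distance(z, omega_m=0.3, omega_k=0.0, steps=2000):
    """Line-of-sight comoving distance D_C in units of c/H0,
    evaluated with Simpson's rule (steps must be even)."""
    h = z / steps
    total = 1.0 / e_of_z(0.0, omega_m, omega_k) + 1.0 / e_of_z(z, omega_m, omega_k)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / e_of_z(i * h, omega_m, omega_k)
    return total * h / 3.0

def angular_diameter_distance(z, omega_m=0.3, omega_k=0.0):
    """Comoving angular diameter distance D_A in units of c/H0 (eq. 9)."""
    d_c = comoving_distance(z, omega_m, omega_k)
    if omega_k > 0.0:      # K < 0 (open): sin of imaginary argument -> sinh
        s = math.sqrt(omega_k)
        return math.sinh(s * d_c) / s
    if omega_k < 0.0:      # K > 0 (closed)
        s = math.sqrt(-omega_k)
        return math.sin(s * d_c) / s
    return d_c

def angular_diameter_distance_taylor(z, omega_m=0.3, omega_k=0.0):
    """Small-curvature expansion of D_A (eq. 10), in units of c/H0."""
    d_c = comoving_distance(z, omega_m, omega_k)
    return d_c * (1.0 + omega_k * d_c**2 / 6.0)
```

For curvature at the percent level the exact and expanded forms agree to much better than the precision of any planned distance measurement, which is why equation (10) is usually adequate for intuition.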

The luminosity distance relating an object's bolometric flux $f_{\rm bol}$ to its bolometric luminosity
$L_{\rm bol}$ is

$$D_L = \sqrt{L_{\rm bol}/(4\pi f_{\rm bol})} = D_A \times (1+z) . \qquad (11)$$

The relation between luminosity and angular diameter distance is independent of cosmology, so the
two measures contain the same information about $H(z)$ and $\Omega_k$. For this reason, we will sometimes
use $D(z)$ to stand in generically for either of these transverse distance measures. Some methods
(e.g., counts of galaxy clusters) effectively probe the comoving volume element that relates solid
angle and redshift intervals to comoving volume $V_C$. We will denote this quantity

$$dV_C(z) = c\,H^{-1}(z)\,D_A^2(z)\,d\Omega\,dz . \qquad (12)$$

On large scales, the gravitational evolution of fluctuations in pressureless dark matter follows
linear perturbation theory, according to which

$$\delta(\mathbf{x},t) \equiv \frac{\rho_m(\mathbf{x},t) - \bar\rho_m(t)}{\bar\rho_m(t)} = \delta(\mathbf{x},t_i) \times \frac{G(t)}{G(t_i)} , \qquad (13)$$

^Note that Hogg (1999) refers to this quantity as the comoving transverse distance and uses $D_A$ to denote the
quantity relating physical size to angular size.
^Recall that $\sin(ix) = i\sinh(x)$.
where $t_i$ is an arbitrarily chosen initial time, the linear growth function $G(t)$ obeys the differential
equation

$$\ddot{G}_{\rm GR} + 2H(z)\,\dot{G}_{\rm GR} - \frac{3}{2}\,\Omega_m H_0^2 (1+z)^3\, G_{\rm GR} = 0 , \qquad (14)$$

and the GR subscript denotes the fact that this equation applies in standard GR. The solution to
this equation can only be written in integral form for specific forms of $H(z)$, and thus for specific
dark energy models specifying $u_\phi(z)$. However, to a very good approximation the logarithmic
growth rate of linear perturbations in GR is

$$f_{\rm GR}(z) \equiv \frac{d\ln G_{\rm GR}}{d\ln a} \approx [\Omega_m(z)]^\gamma , \qquad (15)$$

where $\gamma \approx 0.55$-$0.6$ depends only weakly on cosmological parameters (Peebles, 1980; Lightman and Schechter,
1990). Integrating this equation yields

$$\frac{G_{\rm GR}(z)}{G_{\rm GR}(z=0)} \approx \exp\left[ -\int_0^z \frac{dz'}{1+z'}\, [\Omega_m(z')]^\gamma \right] , \qquad (16)$$

where $\Omega_m(z)$ is given by equation (5). Linder (2005) shows that equation (16) is accurate to better
than 0.5% for a wide variety of dark energy models if one adopts

$$\gamma = 0.55 + 0.05\,[1 + w(z=1)] \qquad (17)$$

(see also Wang and Steinhardt 1998; Weinberg 2005; Amendola et al. 2005). While the full solution
of equation (14) should be used for high accuracy calculations, equation (16) is useful for intuition
and for approximate calculations. Note in particular that if $u_\phi(z) > u_{\phi,0}$ then, relative to a
cosmological constant model, $\Omega_m(z) \propto H^{-2}(z)$ is lower (eq. [5]), so $G_{\rm GR}(z)/G_{\rm GR}(z=0)$ is higher;
i.e., there has been less growth of structure between redshift $z$ and the present day because matter
has been a smaller fraction of the total density over that time. It is often useful to refer the growth
factor not to its $z = 0$ value but to the value at some high redshift when, in typical models, dark
energy is dynamically negligible and $\Omega_m(z) \approx 1$. We will frequently use $z = 9$ as a reference epoch,
in which case equation (16) becomes

$$\frac{G_{\rm GR}(z)}{G_{\rm GR}(z=9)} \approx \exp\left[ \int_z^9 \frac{dz'}{1+z'}\, [\Omega_m(z')]^\gamma \right] . \qquad (18)$$

In the limit $\Omega_m(z) \rightarrow 1$, $G_{\rm GR}(z) \propto (1+z)^{-1}$, i.e., the amplitude of linear fluctuations is proportional
to $a(t)$.
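
To make the growth-index approximation concrete, the following minimal Python sketch evaluates equation (18) for a flat universe with a cosmological constant. It is an illustration under stated assumptions rather than a reproduction of any calculation in the source: the function names, the default $\Omega_m = 0.3$, the fixed $\gamma = 0.55$, and the use of Simpson's rule are choices made for the example, and a high-accuracy calculation would instead solve equation (14) directly.

```python
import math

def omega_m_of_z(z, omega_m):
    """Omega_m(z) for a flat universe with a cosmological constant (eq. 5)."""
    x3 = (1.0 + z)**3
    return omega_m * x3 / (omega_m * x3 + 1.0 - omega_m)

def growth_ratio(z, z_ref=9.0, omega_m=0.3, gamma=0.55, steps=2000):
    """G(z)/G(z_ref) from the growth-index approximation (eq. 18),
    using Simpson's rule for the integral from z to z_ref (steps even)."""
    def integrand(zp):
        return omega_m_of_z(zp, omega_m)**gamma / (1.0 + zp)
    h = (z_ref - z) / steps
    total = integrand(z) + integrand(z_ref)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(z + i * h)
    return math.exp(total * h / 3.0)

if __name__ == "__main__":
    # Matter-only limit: fluctuations grow in proportion to a(t).
    print(growth_ratio(0.0, omega_m=1.0))
    # Illustrative accelerating model: less growth since z = 9.
    print(growth_ratio(0.0, omega_m=0.3))
```

In the matter-only limit the function returns $(1+z_{\rm ref})/(1+z)$, the proportionality to $a(t)$ noted above, while any model in which $\Omega_m(z)$ falls below unity at late times yields a smaller ratio.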



2.2. Model Parameterizations 

The properties of dark energy influence the observables $H(z)$, $D(z)$, and $G(z)$ through
the history of $u_\phi(z)/u_{\phi,0}$ in the Friedmann equation (3). This history is usually framed in terms of
the value and evolution of the equation-of-state parameter $w(z) = p_\phi(z)/u_\phi(z)$. Provided that the
field $\phi$ is not transferring energy directly to or from other components (e.g., by decaying into dark
matter), applying the first law of thermodynamics $dU = -p\,dV$ to a comoving volume implies

$$d(u_\phi a^3) = -p_\phi\, d(a^3) \qquad (19)$$
$$\Longrightarrow\quad a^3\, du_\phi + 3u_\phi a^2\, da = -3w(z)\, u_\phi a^2\, da \qquad (20)$$
$$\Longrightarrow\quad d\ln u_\phi = -3[1 + w(z)]\, d\ln a = 3[1 + w(z)]\, d\ln(1+z) , \qquad (21)$$



^This equation applies on scales much smaller than the horizon. On scales close to the horizon one must pay careful
attention to gauge definitions. Yoo (2009) and Yoo et al. (2009) provide a unified and comprehensive discussion of
the multiple GR effects that influence observable large scale structure on scales approaching the horizon.
where the last equality uses the definition $a = (1+z)^{-1}$. Integrating both sides implies

$$\frac{u_\phi(z)}{u_\phi(z=0)} = \exp\left[ 3\int_0^z [1 + w(z')]\, \frac{dz'}{1+z'} \right] . \qquad (22)$$

For a constant $w$ independent of $z$ we find

$$\frac{u_\phi(z)}{u_\phi(z=0)} = (1+z)^{3(1+w)} , \qquad (23)$$

which yields the familiar results $u \propto (1+z)^3$ for pressureless matter and $u \propto (1+z)^4$ for radiation
($w = +1/3$), and which shows once again that a cosmological constant $u_\phi(z) = {\rm constant}$ corresponds
to $w = -1$.
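
The integral in equation (22) is straightforward to evaluate numerically for any assumed $w(z)$. The Python sketch below is a minimal illustration: the function names and the use of Simpson's rule are choices made here, not part of the source, and the constant-$w$ closed form of equation (23) is included only as a cross-check.

```python
import math

def density_ratio(z, w_of_z, steps=2000):
    """u_phi(z)/u_phi(z=0) from eq. (22) for a user-supplied function w(z),
    integrating 3[1 + w(z')]/(1 + z') with Simpson's rule (steps even)."""
    if z == 0.0:
        return 1.0
    def integrand(zp):
        return 3.0 * (1.0 + w_of_z(zp)) / (1.0 + zp)
    h = z / steps
    total = integrand(0.0) + integrand(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return math.exp(total * h / 3.0)

def density_ratio_constant_w(z, w):
    """Closed form for constant w (eq. 23)."""
    return (1.0 + z)**(3.0 * (1.0 + w))

if __name__ == "__main__":
    # Hypothetical constant value w = -0.9, chosen only for illustration.
    print(density_ratio(2.0, lambda zp: -0.9), density_ratio_constant_w(2.0, -0.9))
```

Passing a different callable for `w_of_z` gives the density history for an evolving equation of state without changing the integrator.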

The first obvious way to parameterize $w(z)$ is with a Taylor expansion $w(z) = w_0 + w'z + \ldots$, but
this expansion becomes ill-behaved at high $z$. A more useful two-parameter model (Chevallier and Polarski,
2001; Linder, 2003a) is

$$w(a) = w_0 + w_a(1 - a) , \qquad (24)$$

in which the value of $w$ evolves linearly with scale factor from $w_0 + w_a$ at small $a$ (high $z$) to $w_0$ at
$z = 0$. Observations usually provide the best constraint on $w$ at some intermediate redshift, not at
$z = 0$, so statistical errors on $w_0$ and $w_a$ are highly correlated. This problem can be circumvented
by recasting equation (24) into the equivalent form

$$w(a) = w_p + w_a(a_p - a) \qquad (25)$$

and choosing the "pivot" expansion factor $a_p$ so that the observational errors on $w_p$ and $w_a$ are
uncorrelated (or at least weakly so). The value of the pivot redshift depends on what data sets are
being considered, but in practice it is usually close to $z_p \approx 0.4$-$0.5$ (see Table E]). The
best-fit $w_p$ is, approximately, the parameter of the constant-$w$ model that would best reproduce the
data. A cosmological constant would be statistically ruled out either if $w_p$ were inconsistent with
$-1$ or if $w_a$ were inconsistent with zero. In practice, error bars on $w_a$ are generally much larger
than error bars on $w_p$, by a factor of 5-10. More generally, it is much more difficult to detect time
dependence of $w$ than to show $w \neq -1$, typically requiring sub-percent measurements of observables
even if $w$ changes by order unity in an interval $\Delta z < 1$ at low redshift (Kujat et al., 2002). The
DETF proposed a figure of merit (FoM) for dark energy programs based on the expected error
ellipse in the $w_0 - w_a$ plane (similar to the approach described by Huterer and Turner 2001). We
will frequently refer to this DETF figure of merit, adopting the definition

$${\rm FoM} = [\sigma(w_p)\,\sigma(w_a)]^{-1} , \qquad (26)$$

and we will refer to dark energy models defined by equations (24) or (25) as "$w_0 - w_a$ models."
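
The pivot construction can be written down explicitly for a Gaussian error ellipse. Comparing equations (24) and (25) gives $w_p = w_0 + (1 - a_p) w_a$, so requiring ${\rm Cov}(w_p, w_a) = 0$ fixes $1 - a_p = -{\rm Cov}(w_0, w_a)/{\rm Var}(w_a)$. The Python sketch below implements this; the function name and the numerical covariance entries in the usage example are hypothetical values chosen for illustration, not forecasts from the source.

```python
import math

def pivot_and_fom(c00, c0a, caa):
    """Pivot and figure of merit from the marginalized covariance of (w0, wa).

    c00 = Var(w0), caa = Var(wa), c0a = Cov(w0, wa).
    Returns (a_p, z_p, sigma(w_p), sigma(w_a), FoM).
    """
    # w_p = w0 + (1 - a_p) wa, so Cov(w_p, wa) = c0a + (1 - a_p) caa = 0.
    one_minus_ap = -c0a / caa
    a_p = 1.0 - one_minus_ap
    z_p = 1.0 / a_p - 1.0
    # Var(w_p) = c00 + 2 (1 - a_p) c0a + (1 - a_p)^2 caa = c00 - c0a^2 / caa.
    sigma_wp = math.sqrt(c00 - c0a**2 / caa)
    sigma_wa = math.sqrt(caa)
    fom = 1.0 / (sigma_wp * sigma_wa)   # eq. (26)
    return a_p, z_p, sigma_wp, sigma_wa, fom

if __name__ == "__main__":
    # Hypothetical errors: sigma(w0) = 0.1, sigma(wa) = 0.4, correlation -0.9.
    print(pivot_and_fom(0.01, -0.036, 0.16))
```

Because $\sigma^2(w_p)\,\sigma^2(w_a)$ equals the determinant of the $(w_0, w_a)$ covariance matrix, the FoM defined this way is inversely proportional to the area of the error ellipse in the $w_0 - w_a$ plane.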

An alternative parameterization approach is to approximate $w(z)$ as a stepwise-constant function
defined by its values in a number of discrete bins, perhaps with priors or constraints on
the allowed values (e.g., $-1 < w(z) < 1$). For a given set of observations, this function can
then be decomposed into orthogonal principal components (PCs), with the first PC being the
one that is best constrained by the data, the second PC the next best constrained, and so forth
(Huterer and Starkman, 2003). Variants of this approach have been widely adopted in recent investigations
(e.g., Albrecht and Bernstein 2007; Sarkar et al. 2008b; Mortonson et al. 2009b), including
the report of the JDEM Figure-of-Merit Science Working Group (Albrecht et al., 2009). The
PCA approach has the advantage of allowing quite general $w(z)$ histories to be represented, though
in practice only a few PCs can be constrained well; Linder and Huterer (2005) and de Putter and Linder
(2008) have argued that the $w_0 - w_a$ parameterization has equal power for practical purposes. We
will use both characterizations for our forecasts in §8.

If $w \neq -1$, then the dark energy density should display spatial inhomogeneities, but for simple
scalar field models these inhomogeneities are strongly suppressed on scales below the horizon.
More complicated models that have a sound speed ($c_s^2 = \delta p/\delta\rho$) much smaller than $c$ allow fluctuations
to grow on sub-horizon scales (e.g., Hu 1998; Erickson et al. 2002; Weller and Lewis 2003;
DeDeo et al. 2003; Bean and Doré 2004). de Putter et al. (2010) provide a clear discussion of the
background physics and observable consequences of dark energy inhomogeneities. In general these
inhomogeneities are very difficult to detect, because their growth is significant only when $w$ is far
from $-1$ and $c_s \ll c$, and because the fluctuations in dark energy density are much smaller than
those in dark matter. We will mostly ignore dark energy inhomogeneities in this article, though we
return to the subject briefly in §7.8.

Our equations so far have assumed that GR is correct. The alternative to dark energy is to
modify GR in a way that produces accelerated expansion. One of the best-studied examples is
DGP gravity (Dvali et al., 2000), which posits a five-dimensional gravitational field equation that
leads to a Friedmann equation

$$H^2(z) = \frac{8\pi G}{3}\,\rho(z) \pm \frac{cH}{r_c} \qquad (27)$$

for a spatially flat, homogeneous universe confined to a (3+1)-dimensional brane. Above the
"crossover scale" $r_c$, which relates the five-dimensional and four-dimensional gravitational constants,
the gravitational force law scales as $r^{-3}$ instead of the usual $r^{-2}$. Choosing the positive
sign for the second term in equation (27) and setting $r_c \sim c/H_0$ leads to an initially decelerating
universe that transitions to accelerating, and ultimately exponential, expansion. Other modifications
to the gravitational action that replace the curvature scalar $R$ by some function $f(R)$ will
modify the Friedmann equation in different ways, some of which can produce late-time acceleration
(e.g., Capozziello and Fang 2002; Carroll 2003). Alternatively, one can simply postulate a modified
Friedmann equation without specifying a complete gravitational theory, e.g., by replacing $\rho$ on
the right hand side of $H^2 \propto \rho$ with a parameterized function $H^2 \propto g(\rho)$ (Freese and Lewis, 2002;
Freese, 2005). Of course, there is no guarantee that such a function can in fact be derived from a
self-consistent gravitational theory.
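
Equation (27) is quadratic in $H$ and can be solved in closed form, which makes the transition from deceleration to exponential expansion easy to see. The Python sketch below is an illustration only, assuming a matter-only universe and the positive sign in equation (27). The shorthand $\Omega_{r_c} \equiv [c/(2 r_c H_0)]^2$ and the function names are introduced here for convenience and do not appear in the text above.

```python
import math

def dgp_hubble_ratio(z, omega_m=0.3):
    """E(z) = H(z)/H0 on the self-accelerating branch of eq. (27),
    for a flat universe containing only pressureless matter.

    Dividing eq. (27) by H0^2 gives E^2 = omega_m (1+z)^3 + 2 sqrt(omega_rc) E,
    with omega_rc = [c / (2 r_c H0)]^2 (shorthand assumed in this sketch).
    Requiring E(0) = 1 fixes sqrt(omega_rc) = (1 - omega_m) / 2.
    """
    root_rc = 0.5 * (1.0 - omega_m)
    return root_rc + math.sqrt(root_rc**2 + omega_m * (1.0 + z)**3)

def crossover_scale(omega_m=0.3):
    """Crossover scale r_c in units of c/H0 implied by E(0) = 1."""
    return 1.0 / (1.0 - omega_m)

if __name__ == "__main__":
    print(crossover_scale())            # r_c is of order c/H0
    print(dgp_hubble_ratio(0.0))        # normalized to 1 today
    print(dgp_hubble_ratio(-1.0))       # far-future limit a -> infinity
```

In the far-future limit the matter term vanishes and $H$ tends to the constant $c/r_c$, which is the exponential expansion described above.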

Using equations (3) and (22), one can express a modified Friedmann equation in terms of an
effective time-dependent dark energy equation of state. In this review, we will use $w(z)$ to parameterize
the expansion histories of both dark energy and modified gravity theories. Given $w(z)$, $H(z)$
and $D(z)$ generally follow from the same set of equations for both types of theories, so observations
that only probe the geometry of the universe are incapable of distinguishing between the two possible
explanations of cosmic acceleration. In addition to changing the Friedmann equation, however,
a modified gravity model may alter the equation (14) that relates the growth of structure to the
expansion history $H(z)$. Therefore, one general approach to testing modified gravity explanations is
to search for inconsistency between observables that probe $H(z)$ or $D(z)$ and observables that also
probe the growth function $G(z)$. Some methods effectively measure $G(z)/G(z=0)$, others measure
$G(z)$ relative to an amplitude anchored in the CMB, and others measure the logarithmic growth
index $\gamma$ of equation (15). For "generic" parameters that describe departures from GR-predicted
growth, we will use a parameter $G_9$ that characterizes an overall multiplicative offset of the growth
factor and a parameter $\Delta\gamma$ that characterizes a change in the fluctuation growth rate. We define
these parameters in §2.4 below, following our review of CMB anisotropy and large scale structure.
These parameters serve as useful diagnostics for deviations from GR, but they do not provide a
complete description of the effects of modified gravity theories. In particular, it is also possible (see
§7.7) that modified gravity will cause $G(z)$ to be scale-dependent, or that it will alter the relation
between gravitational lensing and non-relativistic mass tracers, or that it will reveal its presence
through a high-precision test on solar system or terrestrial scales.
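
The mapping from a modified expansion history to an effective equation of state can be sketched numerically. In the Python example below, which is an illustration under stated assumptions, the effective dark energy density is taken to be whatever remains of $H^2/H_0^2$ after subtracting the matter term in a flat universe, and $w$ follows from the logarithmic derivative of that density with respect to $(1+z)$. The function names, the central-difference derivative, and the closed-form flat DGP expansion rate used as a test case (obtained by solving the quadratic Friedmann equation for $H$ with the normalization $H(0) = H_0$) are all choices made here rather than taken from the source.

```python
import math

def effective_w(z, hubble_ratio, omega_m, dz=1e-4):
    """Effective equation of state implied by E(z) = H(z)/H0 in a flat universe.

    The effective density is u_eff ~ E^2 - omega_m (1+z)^3, and
    w = -1 + (1/3) dln(u_eff)/dln(1+z), evaluated by central differences.
    """
    def ln_u(zz):
        return math.log(hubble_ratio(zz)**2 - omega_m * (1.0 + zz)**3)
    dlnu_dz = (ln_u(z + dz) - ln_u(z - dz)) / (2.0 * dz)
    return -1.0 + (1.0 + z) * dlnu_dz / 3.0

def dgp_hubble_ratio(z, omega_m=0.3):
    """Flat DGP expansion rate on the self-accelerating branch."""
    root_rc = 0.5 * (1.0 - omega_m)
    return root_rc + math.sqrt(root_rc**2 + omega_m * (1.0 + z)**3)

def lcdm_hubble_ratio(z, omega_m=0.3):
    """Flat cosmological-constant model, for comparison."""
    return math.sqrt(omega_m * (1.0 + z)**3 + 1.0 - omega_m)

if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, effective_w(z, lcdm_hubble_ratio, 0.3),
              effective_w(z, dgp_hubble_ratio, 0.3))
```

For the flat DGP case the numerical result can be checked against the analytic form $w_{\rm eff}(z) = -1/[1 + \Omega_m(z)]$, which follows from the same definitions. A geometric probe alone would therefore see this model as a dark energy component with $w$ evolving from $-1/2$ at high redshift toward $-1$ in the future.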

The above considerations lead to the following general strategy for probing the physics of cosmic
acceleration: use observations to constrain the functions $H(z)$, $D(z)$, and $G(z)$, and use these
constraints in turn to constrain the history of $w(z)$ for dark energy models and to test for inconsistencies
that could point to a modified gravity explanation. For pure $H(z)$ and $D(z)$ measurements,
the "nuisance parameters" in such a strategy are the values of $\Omega_m$ and $\Omega_k$, in addition to parameters
related directly to the observational method itself (e.g., the absolute luminosity of supernovae).
Assuming a standard radiation content, the value of $\Omega_\phi = 1 - \Omega_m - \Omega_r - \Omega_k$ is fixed once $\Omega_m$ and
$\Omega_k$ are known. The effects of $\Omega_m$ and $\Omega_k$ are separable both from their different redshift dependence in
the Friedmann equation (3) and from the influence of $\Omega_k$ on transverse distances (eq. [9]) via space
curvature.

2.3. CMB Anisotropies and Large Scale Structure 

CMB anisotropies have little direct constraining power on dark energy, but they play a critical
role in cosmic acceleration studies because they often provide the strongest constraints on
nuisance parameters such as $\Omega_m$, $\Omega_k$, and the high-redshift normalization of matter fluctuations.
In particular, the amplitudes of the acoustic peaks in the CMB angular power spectrum depend
sensitively (and differently) on the matter and baryon densities, and the locations of the peaks
depend sensitively on spatial curvature. Using CMB constraints necessarily brings in additional
nuisance parameters such as the spectral index $n_s$ and curvature $dn_s/d\ln k$ of the scalar fluctuation
spectrum, the amplitude and slope of the tensor (gravitational wave) fluctuation spectrum, the
post-recombination electron-scattering optical depth $\tau$, and the Hubble constant

$$h \equiv H_0/(100\ {\rm km\ s^{-1}\ Mpc^{-1}}) . \qquad (28)$$

However, some of these parameters are themselves relevant to cosmic acceleration studies, and
current CMB measurements yield tight constraints even after marginalizing over many parameters
(e.g., Komatsu et al. 2011). The strength of these constraints depends significantly on the adopted
parameter space: for example, current CMB data provide tight constraints on $h$ if one assumes a
flat universe with a cosmological constant, but these constraints are much weaker if $\Omega_k$ and $w$ are
free parameters.

CMB data are usually incorporated into dark energy constraints, or forecasts, by adding priors
on parameters that are then marginalized over in the analysis. We will adopt this strategy in §8,
using the level of precision forecast for the Planck satellite. However, it is worth noting some rules of
thumb. For practical purposes, Planck data will give near-perfect determinations of $\Omega_m h^2$ and $\Omega_b h^2$
from the heights of the acoustic peaks, where the $h^2$ dependence arises because it is the physical
density that affects the acoustic features, not the density relative to the critical density. "Near-perfect"
means that marginalizing over the expected uncertainties in $\Omega_m h^2$ and $\Omega_b h^2$ adds little to
the error bars on dark energy parameters even from ambitious "Stage IV" experiments, relative to
assuming that they are known perfectly. Planck data will also give near-perfect determinations
of the sound horizon at recombination $r_s(z_*)$, which determines the physical scale of the acoustic
peaks in the CMB and the scale of BAO in large scale structure (see §4.1). Since the angular scale of
the acoustic peaks is precisely measured, Planck data should also yield a near-perfect determination


^However, the effects of Planck-level CMB uncertainties are not completely negligible. For the fiducial Stage IV
program discussed in §8, fixing $\Omega_m h^2$ and $\Omega_b h^2$ instead of marginalizing increases the DETF FoM from 664 to 876.
of the angular diameter distance to the redshift of recombination, $D_* = D_A(z_*)$, where $z_* \approx 1091$.
Finally, the amplitude of CMB anisotropies gives a near-perfect determination (after marginalizing
over the optical depth $\tau$, which is constrained by polarization data) of the amplitude of matter
fluctuations at $z_*$, and thus throughout the era in which dark energy (or deviation from GR) is
negligible. As emphasized by Hu (2005; an excellent source for more detailed discussion of CMB
anisotropies in the context of dark energy constraints), these determinations all depend on the
assumptions of a standard thermal and recombination history, but the CMB data themselves allow
tests of these assumptions at the required level of accuracy. CMB data also allow tests of cosmic
acceleration models via the integrated Sachs-Wolfe (ISW) effect, which we discuss briefly in §7.8.

If primordial matter fluctuations are Gaussian, as predicted by simple inflation models and
supported by most observational investigations to date, then their statistical properties are fully
specified by the power spectrum $P(k)$ or its Fourier transform, the two-point correlation function
$\xi(r)$. Defining the Fourier transform of the density contrast

$$\tilde\delta(\mathbf{k}) = \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, \delta(\mathbf{r}) , \qquad \delta(\mathbf{r}) = (2\pi)^{-3} \int d^3k\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \tilde\delta(\mathbf{k}) , \qquad (29)$$

the power spectrum is defined by

$$\langle \tilde\delta(\mathbf{k})\, \tilde\delta^*(\mathbf{k}') \rangle = (2\pi)^3 P(k)\, \delta_D^3(\mathbf{k} - \mathbf{k}') , \qquad (30)$$

where $\delta_D^3$ is a 3-d Dirac-delta function and isotropy guarantees that $P(\mathbf{k})$ is a function of $k = |\mathbf{k}|$
alone. The power spectrum has units of volume, and it is often more intuitive to discuss the
dimensionless quantity

$$\Delta^2(k) \equiv (2\pi)^{-3} \times 4\pi k^3 P(k) = \frac{d\sigma^2}{d\ln k} , \qquad (31)$$

which is the contribution to the variance $\sigma^2 \equiv \langle \delta^2 \rangle$ of the density contrast per logarithmic interval
of $k$. The variance of the density field smoothed with a window $W_R(r)$ of scale $R$ is

$$\sigma^2(R) = \int_0^\infty \frac{dk}{k}\, \Delta^2(k)\, \widetilde{W}_R^2(k) , \qquad (32)$$

where the Fourier transform of a top-hat window, $W_R(r) = (4\pi R^3/3)^{-1}\, \Theta(1 - r/R)$, is

$$\widetilde{W}_R(k) = \frac{3}{k^3 R^3} \left[ \sin(kR) - kR\cos(kR) \right] , \qquad (33)$$

and the Fourier transform of a Gaussian window, $W_R(r) = (2\pi)^{-3/2} R^{-3} e^{-r^2/2R^2}$, is

$$\widetilde{W}_R(k) = e^{-k^2 R^2/2} . \qquad (34)$$
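
The smoothed variance is simple to compute once $\Delta^2(k)$ is specified. The Python sketch below evaluates the integral over $\ln k$ with the trapezoidal rule for either window function. It is an illustration only: the power-law form of $\Delta^2(k)$, the function names, and the integration limits are assumptions made for the example, and a realistic calculation would substitute a $\Delta^2(k)$ from a linear power spectrum.

```python
import math

def tophat_window(k, radius):
    """Fourier transform of the top-hat window (eq. 33)."""
    x = k * radius
    if x < 1e-3:                      # series limit avoids 0/0 at small x
        return 1.0 - x**2 / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def gaussian_window(k, radius):
    """Fourier transform of the Gaussian window (eq. 34)."""
    return math.exp(-0.5 * (k * radius)**2)

def power_law_delta2(k, k0=1.0, index=2.0):
    """Toy spectrum Delta^2(k) = (k/k0)^index, assumed for this sketch."""
    return (k / k0)**index

def sigma_squared(radius, delta2, window, ln_x_min=-12.0, ln_x_max=6.0, steps=6000):
    """Variance of the smoothed density field (eq. 32).

    The integral is done in ln(k * radius) with the trapezoidal rule,
    between exp(ln_x_min) and exp(ln_x_max).
    """
    h = (ln_x_max - ln_x_min) / steps
    total = 0.0
    for i in range(steps + 1):
        k = math.exp(ln_x_min + i * h) / radius
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * delta2(k) * window(k, radius)**2
    return total * h

if __name__ == "__main__":
    print(sigma_squared(8.0, power_law_delta2, gaussian_window))
    print(sigma_squared(8.0, power_law_delta2, tophat_window))
```

For the toy spectrum $\Delta^2 = (k/k_0)^2$ and the Gaussian window, the integral can be done analytically and gives $\sigma^2(R) = 1/(2k_0^2R^2)$, which provides a convenient check on the numerical integration.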

The correlation function is 



00 



dk ^2(1 \ sin(/cr) 



ar) = (<5(x)<5(x + r)) = / -A\k)^-^. (35) 







In linear perturbation theory, the power spectrum amplitude is proportional to G'^{z), and we 
will take P\\n{k) to refer to the z = normalization when the redshift is not otherwise specified: 

PUk,z) = -^^^^PUk). (36) 
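The smoothing integral of equation (32) is straightforward to evaluate numerically. The following is a minimal sketch, assuming NumPy and a toy power-law $\Delta^2(k)$ chosen only because it has a closed-form answer for the Gaussian window; the function names and the toy spectrum are illustrative and do not come from the text.

```python
import numpy as np

def tophat_window(k, R):
    """Fourier transform of a real-space top-hat of radius R (eq. 33)."""
    x = np.asarray(k, dtype=float) * R
    return 3.0 * (np.sin(x) - x * np.cos(x)) / x**3

def gaussian_window(k, R):
    """Fourier transform of a Gaussian window of scale R (eq. 34)."""
    return np.exp(-0.5 * (np.asarray(k, dtype=float) * R) ** 2)

def sigma2(delta2, R, window, kmin=1e-5, kmax=1e3, n=20000):
    """sigma^2(R) = int dk/k Delta^2(k) W^2(k), integrated in ln k (eq. 32)."""
    lnk = np.linspace(np.log(kmin), np.log(kmax), n)
    k = np.exp(lnk)
    integrand = delta2(k) * window(k, R) ** 2
    # Trapezoid rule in ln k, written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lnk)))

# Toy spectrum (an assumption for illustration only): Delta^2(k) = (k / k0)^2
k0 = 0.2

def toy_delta2(k):
    return (k / k0) ** 2

print(sigma2(toy_delta2, 8.0, gaussian_window))  # compare with 1 / (2 k0^2 R^2)
print(sigma2(toy_delta2, 8.0, tophat_window))
```

For this toy spectrum the Gaussian-window result can be checked against the analytic value $1/(2k_0^2R^2)$; a realistic calculation would replace `toy_delta2` with the linear $\Delta^2(k)$ from a Boltzmann code.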



Footnote: A variety of Fourier conventions float around the cosmology literature. Here we adopt the
same Fourier conventions and definitions as Dodelson (2003).






We discuss the normalization of $G(z)$ and $P_{\rm lin}(k)$ more precisely in §2.4 below. The evolution of
$P(k)$ remains close to linear theory for scales $k \ll k_{\rm nl}$, where

$$\int_0^{k_{\rm nl}} \frac{dk}{k}\,\Delta^2(k) = 1. \tag{37}$$

For realistic power spectra, non-linear evolution on small scales does not feed back to alter the
linear evolution on large scales (Peebles 1980; Shandarin and Melott 1990; Little et al. 1991).
However, the shape of the power spectrum does change on scales approaching $k_{\rm nl}$, in ways that
can be calculated using N-body simulations (Heitmann et al. 2010) or several variants of
cosmological perturbation theory (Carlson et al. 2009 and references therein). Non-linear evolution is a
significant effect for weak lensing predictions and for the evolution of BAO, as we discuss in the
corresponding sections below.

While there are many ways of characterizing the matter distribution in the non-linear regime, 
the two measures that matter the most for our purposes are the mass function and clustering 
bias of dark matter halos. There are several different algorithms for identifying halos in N-body 
simulations, all of them designed to pick out collapsed, gravitationally bound dark matter structures 
in approximate virial equilibrium. It is convenient to express the halo mass function in the form 



$$\frac{dn}{d\ln M} = f(\sigma)\,\bar\rho_m \left|\frac{d\ln\sigma}{dM}\right|, \tag{38}$$

where $\sigma^2$ is the variance of the linear density field smoothed with a top-hat filter of mass scale
$M = \frac{4}{3}\pi R^3\bar\rho_m$ (eqs. [32] and [33]). To a first approximation, the function $f(\sigma)$ is universal, and the
effects of power spectrum shape, redshift (and thus power spectrum amplitude), and background
cosmological model (e.g., $\Omega_m$ and $\Omega_\Lambda$) enter only through determining $|d\ln\sigma/dM|$ and $\bar\rho_m$. The
state-of-the-art numerical investigation is that of Tinker et al. (2008), who fit a large number of
N-body simulation results with the functional form

$$f(\sigma) = A\left[\left(\frac{\sigma}{b}\right)^{-a} + 1\right]e^{-c/\sigma^2}, \tag{39}$$

finding best-fit values $A = 0.186$, $a = 1.47$, $b = 2.57$, $c = 1.19$ for $z = 0$ halos, defined to be
spherical regions centered on density peaks enclosing a mean interior overdensity of 200 times the
cosmic mean density $\bar\rho_m$. (Different halo mass definitions lead to different best-fit values.) This
functional form was justified on analytic grounds by Sheth and Tormen (1999), following a line
of argument that ultimately traces back to Press and Schechter (1974) and Bond et al. (1991).
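As a minimal sketch of how equations (38) and (39) combine, the code below evaluates the multiplicity function with the quoted best-fit coefficients and converts it to $dn/d\ln M$. The power-law $\sigma(M)$ and the numerical value of the mean density are hypothetical stand-ins chosen for illustration; a real calculation would obtain $\sigma(M)$ from equation (32) with a top-hat window.

```python
import numpy as np

# Best-fit z = 0, overdensity-200 values quoted in the text for eq. (39)
A, a, b, c = 0.186, 1.47, 2.57, 1.19

def tinker_f(sigma):
    """Multiplicity function f(sigma) of eq. (39)."""
    sigma = np.asarray(sigma, dtype=float)
    return A * ((sigma / b) ** (-a) + 1.0) * np.exp(-c / sigma**2)

def dn_dlnM(M, sigma_of_M, rho_m):
    """Eq. (38): dn/dlnM = f(sigma) * rho_m * |dln(sigma)/dM|."""
    M = np.asarray(M, dtype=float)
    sigma = sigma_of_M(M)
    dlnsigma_dM = np.gradient(np.log(sigma), M)  # finite-difference derivative
    return tinker_f(sigma) * rho_m * np.abs(dlnsigma_dM)

def toy_sigma(M):
    """Hypothetical power-law sigma(M), for illustration only."""
    return (np.asarray(M, dtype=float) / 1e13) ** (-0.25)

RHO_M = 7.4e10  # illustrative mean matter density in h^2 Msun / Mpc^3

masses = np.logspace(12, 15, 3001)
mass_function = dn_dlnM(masses, toy_sigma, RHO_M)
```

Because $f(\sigma)$ is treated as universal, changing the cosmology in this sketch only means supplying a different `sigma_of_M` and `rho_m`.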



Discussions of the halo population frequently refer to the characteristic mass scale $M_*$, defined by

$$\sigma(M_*) = \delta_c = 1.686, \tag{40}$$

which sets the location of the exponential cutoff in the Press-Schechter mass function. Here $\delta_c$ is
the linear theory overdensity at which a spherically symmetric perturbation would collapse.

In detail, Tinker et al. (2008) find that $f(\sigma)$ depends on redshift at the 10-20% level, probably
because of the dependence of halo mass profiles on $\Omega_m(z)$. At overdensities of $\sim 200$, the baryon
fraction in group and cluster mass halos ($M > 10^{13} M_\odot$) is expected to be close to the cosmic mean
ratio $\Omega_b/\Omega_m$, but gas pressure, dissipation, and feedback from star formation and AGN can alter
this fraction and change baryon density profiles relative to dark matter profiles. We discuss these
issues further below.

Footnote: See Gunn and Gott (1972), but note that their argument must be corrected to growing mode
initial conditions, as is done in standard textbook treatments. The value $\delta_c = 1.686$ is derived for
$\Omega_m = 1$, but the cosmology dependence is weak.

Massive halos are more strongly clustered than the underlying matter distribution because they
form near high peaks of the initial density field, which arise more frequently in regions where the
background density is high (Kaiser 1984; Bardeen et al. 1986). On large scales, the correlation
function of halos of mass $M$ is a scale-independent multiple of the matter correlation function,
$\xi_{hh}(r) = b_h^2(M)\,\xi_{mm}(r)$. The halo-mass cross-correlation in this regime is $\xi_{hm}(r) = b_h(M)\,\xi_{mm}(r)$,
and similar scalings ($b_h^2$ and $b_h$) hold for the halo power spectrum and halo-mass cross spectrum at
low $k$. Analytic arguments suggest a bias factor (Cole and Kaiser 1989; Mo and White 1996)

$$b_h(M) = 1 + \frac{[\delta_c/\sigma(M)]^2 - 1}{\delta_c}. \tag{41}$$

There have been numerous refinements to this formula based on analytic models and numerical
calibrations. The state-of-the-art numerical study is that of Tinker et al. (2010).
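The analytic bias formula can be evaluated directly. This is a minimal sketch assuming the standard Cole and Kaiser / Mo and White form $b_h = 1 + (\nu^2 - 1)/\delta_c$ with $\nu = \delta_c/\sigma(M)$; it is not the numerically calibrated Tinker et al. (2010) fit, and the sample $\sigma$ values are arbitrary.

```python
DELTA_C = 1.686  # linear collapse threshold from eq. (40)

def halo_bias(sigma_M, delta_c=DELTA_C):
    """Analytic large-scale halo bias, b_h = 1 + (nu^2 - 1) / delta_c."""
    nu = delta_c / sigma_M
    return 1.0 + (nu**2 - 1.0) / delta_c

print(halo_bias(DELTA_C))        # sigma = delta_c, i.e. M = M_*
print(halo_bias(0.5 * DELTA_C))  # rarer, more massive halos
print(halo_bias(2.0 * DELTA_C))  # common, low-mass halos
```

In this form halos with $M = M_*$ are unbiased, halos above $M_*$ have $b_h > 1$, and halos below $M_*$ have $b_h < 1$.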

Galaxies reside in dark matter halos, and they, too, are biased tracers of the underlying matter 
distribution. Here one must allow for the fact that different kinds of galaxies reside in different 
mass halos and that massive halos host multiple galaxies. More massive or more luminous galaxies 
are more strongly clustered because they reside in more massive halos that have higher $b_h(M)$. At
low redshift, the large scale bias factor is $b_g < 1$ for galaxies below the characteristic cutoff $L_*$ of
the Schechter (1976) luminosity function, but it rises sharply at higher luminosities (Norberg et al.
2001; Zehavi et al. 2005, 2011).



For a galaxy sample defined by a threshold $L_{\rm min}$ in optical or near-IR luminosity (or stellar
mass), theoretical models and empirical studies (too numerous to list comprehensively, but our
summary here is especially influenced by Kravtsov et al. 2004; Conroy et al. 2006; Zehavi et al.
2011) suggest the following approximate model. The minimum host halo mass $M_{\rm min}$ is the one for which
the comoving space density $n(M_{\rm min})$ of halos above $M_{\rm min}$ matches the space density $n(L_{\rm min})$ of
galaxies above the luminosity threshold. Each halo above $M_{\rm min}$ hosts one central galaxy, and in
addition each such halo hosts a mean number of satellite galaxies $\langle N_{\rm sat}\rangle = (M - M_{\rm min})/15M_{\rm min}$,
with the actual number of satellites drawn from a Poisson distribution with this mean. The large
scale galaxy bias factor $b_g$ is the average bias factor $b_h(M)$ of halos above $M_{\rm min}$, with the average
weighted by the product of the halo space density and the average number of galaxies per halo. In
addition to increasing $b_g$ by giving more weight to high mass halos, satellite galaxies contribute to
clustering on small scales, where pairs or groups of galaxies reside in a single halo (Seljak 2000;
Scoccimarro et al. 2001; Berlind and Weinberg 2002). In detail, at high luminosities one must
allow for scatter between galaxy luminosity and halo mass, which reduces the bias below that of
the sharp threshold model described above. Furthermore, selecting galaxies by color or spectral
type alters the relative fractions of central and satellite galaxies; redder, more passive galaxies are
more strongly clustered because a larger fraction of them are satellites, and the reverse holds for
bluer galaxies with active star formation. Thus each class of galaxies has its own halo occupation
distribution (HOD), which describes the probability $P(N|M)$ of finding $N$ galaxies in a halo of
mass $M$ and specifies any relative bias of galaxies and dark matter within halos.
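The threshold-sample model described above can be sketched in a few lines. The example assumes the satellite occupation $\langle N_{\rm sat}\rangle = (M - M_{\rm min})/15M_{\rm min}$ as read from the text, and it uses toy halo abundance and bias curves that are purely illustrative (they are not fits to simulations); all function names are hypothetical.

```python
import numpy as np

def mean_satellites(M, M_min):
    """<N_sat> = (M - M_min) / (15 M_min) above threshold, zero below it."""
    M = np.asarray(M, dtype=float)
    return np.where(M >= M_min, (M - M_min) / (15.0 * M_min), 0.0)

def mean_occupation(M, M_min):
    """One central galaxy per halo above M_min, plus the mean satellite count."""
    centrals = (np.asarray(M, dtype=float) >= M_min).astype(float)
    return centrals + mean_satellites(M, M_min)

def weighted_bias(M, dn_dlnM, b_h, n_gal):
    """Average b_h(M) weighted by halo abundance times galaxies per halo."""
    lnM = np.log(M)
    w = dn_dlnM * n_gal

    def trap(y):
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(lnM))

    return trap(w * b_h) / trap(w)

# Toy inputs, for illustration only
M = np.logspace(12, 15, 400)
abundance = M**-1.0 * np.exp(-M / 1e14)
b_h = 1.0 + (M / 1e13) ** 0.5
M_min = 1e12

b_g = weighted_bias(M, abundance, b_h, mean_occupation(M, M_min))
b_cen = weighted_bias(M, abundance, b_h, (M >= M_min).astype(float))

rng = np.random.default_rng(0)
n_sat = rng.poisson(mean_satellites(M, M_min))  # one Poisson draw per halo
```

Comparing `b_g` with the centrals-only value `b_cen` shows the effect noted in the text: satellites shift weight toward massive halos and raise the large scale bias.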

On large scales, where $b_g^2\Delta^2_{\rm lin}(k, z) \ll 1$, the galaxy power spectrum should have the same
shape as the linear matter power spectrum, $P_{gg}(k,z) = b_g^2 P_{\rm lin}(k, z)$. However, scale-dependence of
bias at the 10-20% level can persist to quite low $k$, especially for luminous, highly biased galaxy
populations (Yoo et al. 2009). Combinations of CMB power spectrum measurements with galaxy
power spectrum measurements can yield tighter cosmological parameter constraints than either
one in isolation (e.g., Cole et al. 2005; Reid et al. 2010). In particular, this combination provides
greater leverage on the Hubble constant $h$, since CMB-constrained models predict galaxy clustering
in Mpc while galaxy redshift surveys measure distances in $h^{-1}$ Mpc (or, equivalently, in km s$^{-1}$).

Footnote: To make the model more accurate, one should adjust $M_{\rm min}$ iteratively so that the total space
density of galaxies, central+satellite, matches the observed $n(L_{\rm min})$, but this is usually a modest
correction because the typical fraction of galaxies that are satellites is 5-20%.

Another complicating factor in galaxy clustering measurements is redshift-space distortion
(Kaiser 1987; see Hamilton 1997 for a comprehensive review), which arises because galaxy
redshifts measure a combination of distance and peculiar velocity rather than true distance. On small
scales, velocity dispersions in collapsed objects stretch structures along the line of sight. On large
scales, coherent inflow to overdense regions compresses them in the line-of-sight direction, and
coherent outflow from underdense regions stretches them along the line of sight. In linear
perturbation theory, the divergence of the peculiar velocity field is related to the density contrast field
by

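$$\nabla\cdot\mathbf{v}(\mathbf{x}, z) = -(1+z)^{-1}H(z)\,\frac{d\ln G}{d\ln a}\,\delta(\mathbf{x}, z) \approx -(1+z)^{-1}H(z)\,[\Omega_m(z)]^{\gamma}\,\delta(\mathbf{x}, z), \tag{42}$$

with $\gamma$ defined by equation (15). The galaxy redshift-space power spectrum in linear theory is
anisotropic, depending on the angle $\theta$ between the wavevector $\mathbf{k}$ and the observer's line of sight as

$$P_g(k,\mu) = b_g^2\,(1 + \beta\mu^2)^2\,P(k), \tag{43}$$

where $P(k)$ is the real-space matter power spectrum, $\mu = \cos\theta$, and

$$\beta = \frac{1}{b_g}\frac{d\ln G}{d\ln a} \approx \frac{[\Omega_m(z)]^{\gamma}}{b_g}. \tag{44}$$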



A variety of non-linear effects, most notably the small scale dispersion and its correlation with
large scale density, mean that equation (43) is rarely an adequate approximation in practice, even
on quite large scales (Cole et al. 1994; Scoccimarro 2004). In the galaxy correlation function, one
can remove the effects of redshift-space distortion straightforwardly by projection, counting galaxy
pairs as a function of projected separation rather than 3-d redshift-space separation. For the power
spectrum, one can correct for redshift-space distortion, but the analysis is more model-dependent
(see, e.g., Tegmark et al. 2004). However, redshift-space distortion can be an asset as well as a
nuisance, since it provides a route to measuring $d\ln G/d\ln a$. We will discuss this idea in §7.2.
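To make the linear-theory anisotropy of equation (43) concrete, the sketch below evaluates the redshift-space power as a function of $\mu$ and the angle-averaged (monopole) boost $1 + 2\beta/3 + \beta^2/5$ that follows from integrating $(1+\beta\mu^2)^2$ over $\mu$. The parameter values ($b_g = 2$, $\gamma = 0.55$, $\Omega_m(z) = 0.75$) are illustrative assumptions, not values from the text, and the caveats above about non-linear effects apply.

```python
import numpy as np

def kaiser_power(P_real, mu, b_g, beta):
    """Linear-theory redshift-space galaxy power spectrum of eq. (43)."""
    return b_g**2 * (1.0 + beta * mu**2) ** 2 * P_real

def kaiser_monopole_boost(beta):
    """Average of (1 + beta mu^2)^2 over mu in [0, 1]."""
    return 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0

# Illustrative values only
b_g, gamma, omega_m_z = 2.0, 0.55, 0.75
beta = omega_m_z**gamma / b_g  # eq. (44) with the growth-index approximation

mu = (np.arange(100000) + 0.5) / 100000  # midpoint grid on [0, 1]
numeric_boost = kaiser_power(1.0, mu, 1.0, beta).mean()
print(beta, numeric_boost, kaiser_monopole_boost(beta))
```

Modes along the line of sight ($\mu = 1$) are boosted by $(1+\beta)^2$ while transverse modes ($\mu = 0$) are unchanged, which is what allows $\beta$, and hence the growth rate, to be measured.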

2.4. Parameter Dependences and CMB Constraints

Figure 1 illustrates the four statistics discussed above: the CMB temperature angular power
spectrum, the matter variance $\Delta^2_{\rm lin}(k)$ computed from the linear theory power spectrum at $z = 0$,
the $z = 0$ halo mass function computed from equations (38) and (39), and the halo bias factor
computed from equation (6) of Tinker et al. (2010) for overdensity 200 halos (relative to the mean
matter density). Curves in the main panels show a fiducial model with the likelihood-weighted
mean parameters for the seven-year WMAP CMB measurements (hereafter WMAP7; Larson et al.
2011) assuming a flat universe with a cosmological constant: $\Omega_c = 0.222$, $\Omega_b = 0.045$, $\Omega_\Lambda = 0.733$,
$h = 0.71$, $n_s = 0.963$, $\tau = 0.088$, and primordial power spectrum amplitude $A_s(k = 0.002\,{\rm Mpc}^{-1}) =
2.43 \times 10^{-9}$. (These parameters also assume no tensor fluctuations and $dn_s/d\ln k = 0$.) The
CMB power spectrum shows the familiar pattern of acoustic peaks, with the angular scale of
the first peak corresponding approximately to the sound horizon at recombination divided by the
angular diameter distance to the last scattering surface. The matter variance $\Delta^2_{\rm lin}(k)$ shows a slow
change of slope starting at $k \sim 0.02\,h\,{\rm Mpc}^{-1}$, corresponding to the horizon scale at matter-radiation
equality, and low amplitude wiggles at smaller scales produced by BAO. The halo mass function
has an approximate power-law form at low masses changing slowly to an exponential cutoff for
$M > M_* = 3 \times 10^{12}\,h^{-1} M_\odot$. The $b_h(M)$ relation is roughly flat for $M < 5M_*$ before rising steeply
at higher masses. The $h$-dependences used for $k$, $dn/d\ln M$, and $M$ reflect the dependences that
typically arise when distances are estimated from redshifts and thus scale as $h^{-1}$.

Figure 1. CMB angular power spectrum (upper left), variance of matter fluctuations (upper right),
halo mass function (lower left), and halo bias factor (lower right). Solid curves in the main panels
show predictions of the fiducial $\Lambda$CDM model listed in Table 1. Curves in the lower panels show
the fractional changes in these statistics induced by changing $1 + w$ to $\pm 0.1$ or $\Omega_k$ to $\pm 0.01$ (see
legend). For each parameter change, we keep $\Omega_m h^2$, $\Omega_b h^2$, and $D_*$ fixed by adjusting $\Omega_m$, $\Omega_b$, and
$h$ (see Table 1). These compensating changes keep deviations in the CMB spectrum minimal, much
smaller than the cosmic variance errors indicated by the shaded region.

Table 1. Fiducial Model and Simple Variants

  $w$      $\Omega_k$    $\Omega_c$    $\Omega_b$    $\Omega_\Lambda$    $h$      $\sigma_8$
  -1.0     0.00          0.222         0.045         0.733               0.710    0.806
  -0.9     0.00          0.246         0.050         0.704               0.675    0.774
  -1.1     0.00          0.201         0.041         0.758               0.746    0.837
  -1.0     0.01          0.186         0.038         0.766               0.776    0.809
  -1.0    -0.01          0.256         0.052         0.702               0.661    0.802

Note. All models have $n_s = 0.963$, $\tau = 0.088$, $A_s(k = 0.002\,{\rm Mpc}^{-1}) = 2.43 \times 10^{-9}$.

In the lower panels, we show the fractional change in these statistics that arises when changing
$1 + w$ from 0 to $\pm 0.1$ and when changing $\Omega_k$ from 0 to $\pm 0.01$. With any parameter variation, there
is the crucial question of what one holds fixed. For this figure, we have held fixed the parameter
combinations that have the strongest impact on the CMB power spectrum: $\Omega_m h^2$ and $\Omega_b h^2$, which
determine the heights of the acoustic peaks and the physical scale of the sound horizon, and
$D_* = D_A(z_*)$, which maps the physical scale of the peaks into the angular scale. We satisfy these
constraints by allowing $h$ and $\Omega_m$ to vary, maintaining $\Omega_k = 0$ for the $w$-variations and $w = -1$ for
the $\Omega_k$-variations, with $n_s$, $A_s$, and $\tau$ fixed to the fiducial model values. The parameter values for
these variant models appear in Table 1.
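The fixed-parameter construction can be checked directly from the tabulated values. The sketch below assumes the Table 1 columns are, in order, $w$, $\Omega_k$, $\Omega_c$, $\Omega_b$, $\Omega_\Lambda$, and $h$ (an interpretation of the extracted table, with the fiducial row taken as $w = -1$, $\Omega_k = 0$), and verifies that $\Omega_m h^2$ and $\Omega_b h^2$ are constant across rows to within rounding and that each row satisfies the closure relation.

```python
# Rows of Table 1: (w, Omega_k, Omega_c, Omega_b, Omega_Lambda, h)
TABLE1 = [
    (-1.0,  0.00, 0.222, 0.045, 0.733, 0.710),
    (-0.9,  0.00, 0.246, 0.050, 0.704, 0.675),
    (-1.1,  0.00, 0.201, 0.041, 0.758, 0.746),
    (-1.0,  0.01, 0.186, 0.038, 0.766, 0.776),
    (-1.0, -0.01, 0.256, 0.052, 0.702, 0.661),
]

def physical_densities(row):
    """Return (Omega_m h^2, Omega_b h^2) for one table row."""
    _, _, om_c, om_b, _, h = row
    return (om_c + om_b) * h**2, om_b * h**2

def closure_residual(row):
    """Omega_c + Omega_b + Omega_Lambda + Omega_k - 1, zero up to rounding."""
    _, om_k, om_c, om_b, om_l, _ = row
    return om_c + om_b + om_l + om_k - 1.0

for row in TABLE1:
    omh2, obh2 = physical_densities(row)
    print(f"w={row[0]:+.1f} Ok={row[1]:+.2f}  Omh2={omh2:.4f}  Obh2={obh2:.4f}")
```

The printed physical densities agree across all five models at the level expected from three-digit rounding, which is the sense in which the variants are "CMB-matched".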

From the CMB panel, we can see that the changes in the angular power spectrum induced by
these parameter variations are small compared to the cosmic variance error at every $l$, since we
have fixed the parameter combinations that mostly determine the CMB spectrum. The changes
are coherent, of course, but even considering model fits to the entire CMB spectrum the $w$ changes
would be undetectable at the level of errors forecast for Planck, while the $\Omega_k = \pm 0.01$ models would
be distinguishable from the fiducial model at about $1.5\sigma$. The impact of these parameter changes
must instead be sought in other statistics at much lower redshifts. Changes to the matter variance
are $\sim 5\%$ at small scales, growing to $\sim 20\%$ at large scales, with oscillations that reflect the shift
in the BAO scale. Fractional changes to the halo space density at fixed mass can be much larger,
especially at high masses where the halo mass function is steep. We caution, however, that the
fractional change in mass at fixed abundance is significantly smaller, a point that we emphasize
below. The impact of a change in $w$ reverses sign at $M \approx 6 \times 10^{14}\,h^{-1} M_\odot \approx 200M_*$, where the mass
function begins to drop sharply. Changes in bias factor at fixed mass are $\sim 5\%$ at high masses and
smaller at low masses.

Footnote: The CMB cosmic variance error is $\Delta\ln C_l = [(2l + 1)/2]^{-1/2}$, determined simply by the
number of modes on the sky at each angular scale.

Figure 2. Evolution of the Hubble parameter (left) and the comoving angular diameter distance
(right) for the fiducial $\Lambda$CDM model and for the variant models shown in Figure 1. Upper panels
are in absolute units, relevant for BAO, while lower panels show distances in $h^{-1}$ Gpc, relevant for
supernovae or weak lensing.

Figure 2 shows the redshift evolution and parameter sensitivity of the Hubble parameter (eq. [3])
and the comoving angular diameter distance (eq. [9]), for the same fiducial model and parameter
variations used in Figure 1. The upper panels show $H(z)$ and $D_A(z)$ in absolute units, while the
lower panels plot them in $h^{-1}$ Gpc units. BAO studies measure distances in absolute units, but supernova
studies effectively measure $hD_A(z)$ because they are calibrated in the local Hubble flow.
Equivalently, supernova distances are determined in $h^{-1}$ Mpc rather than Mpc. Weak lensing predictions
depend on distance ratios rather than absolute distances, so in practice they also constrain $hD_A(z)$
rather than absolute $D_A(z)$.

Figure 3. Evolution of the linear growth factor $G(z)$ and growth rate $f(z)$ for the models shown
in Figure 2, assuming GR. The scaling in the upper left panel removes the $(1 + z)^{-1}$ evolution that
would arise in an $\Omega_m = 1$ universe and normalizes $G_{\rm GR}(z)$ to one at $z = 9$.

In absolute units, model predictions diverge most strongly at $z = 0$, and the impact of $\Omega_k =
\pm 0.01$ is larger than the impact of $1 + w = \pm 0.1$. The impact of the $w$ change on $H(z)$ reverses sign
at $z \sim 0.6$, a consequence of our CMB normalization. Changing $w$ to $-0.9$ would on its own reduce
the distance to $z_*$, and $H_0$ must therefore be lowered to keep $D_*$ fixed. However, with $\Omega_m h^2$ fixed,
lower $H_0$ implies a higher $\Omega_m$, which raises the ratio $H(z)/H_0$, and at high redshift this effect wins
out over the lower $H_0$. At $z > 2$, $D_A(z)$ remains sensitive to $\Omega_k$ but is insensitive to $w$, while the
sensitivity of $H(z)$ to $w$ is roughly flat for $1 < z < 3$. In $h^{-1}$ Mpc units, models converge at $z = 0$
by definition, and the impact of $1 + w = \pm 0.1$ is generally larger than the impact of $\Omega_k = \pm 0.01$.
The sensitivity of $hD_A(z)$ to parameter changes increases monotonically with increasing redshift,
growing rapidly until $z = 0.5$ and flattening beyond $z = 1$.
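The distinction between absolute and $h^{-1}$ Mpc distances can be illustrated with a short numerical sketch. It assumes a constant-$w$ dark energy component, neglects radiation (adequate at the low redshifts shown, but not for the distance to $z_*$), and takes $\Omega_m = \Omega_c + \Omega_b$ and $h$ for the fiducial and $w = -0.9$ models from Table 1; the function names are illustrative.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def E(z, om_m, om_k, w):
    """H(z)/H0 for matter + curvature + constant-w dark energy (no radiation)."""
    om_de = 1.0 - om_m - om_k
    return np.sqrt(om_m * (1 + z) ** 3 + om_k * (1 + z) ** 2
                   + om_de * (1 + z) ** (3 * (1 + w)))

def comoving_angular_distance(z, h, om_m, om_k=0.0, w=-1.0, n=4001):
    """Comoving angular diameter distance in Mpc (absolute units)."""
    zs = np.linspace(0.0, z, n)
    f = 1.0 / E(zs, om_m, om_k, w)
    chi = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(zs))  # H0 * D_C / c
    if om_k > 0:
        chi = np.sinh(np.sqrt(om_k) * chi) / np.sqrt(om_k)
    elif om_k < 0:
        chi = np.sin(np.sqrt(-om_k) * chi) / np.sqrt(-om_k)
    return C_KMS / (100.0 * h) * chi

fid = dict(h=0.710, om_m=0.267, om_k=0.0, w=-1.0)
var = dict(h=0.675, om_m=0.296, om_k=0.0, w=-0.9)

for z in (0.1, 0.5, 1.0, 2.0):
    d_fid = comoving_angular_distance(z, **fid)
    d_var = comoving_angular_distance(z, **var)
    # fractional difference in Mpc, then in h^-1 Mpc
    print(z, d_var / d_fid - 1.0, (var["h"] * d_var) / (fid["h"] * d_fid) - 1.0)
```

In absolute units the two models differ most as $z \to 0$, where the ratio tends to $h_{\rm fid}/h_{\rm var}$, while in $h^{-1}$ Mpc units they converge there, consistent with the behavior described for Figure 2.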

For structure growth, the issues of normalization are more subtle. The normalization of the
matter power spectrum is known better from CMB anisotropy at $z_*$ than it is from local
measurements at $z = 0$, and this will be still more true in the Planck era. It therefore makes sense to
anchor the normalization in the CMB, even though the value at $z = 0$ then depends on cosmological
parameters. Figure 3 (left panel) plots $(1 + z)G_{\rm GR}(z)$, where $G_{\rm GR}(z)$ obeys equation (14) and is
normalized to unity at $z = 9$. In most models, dark energy is dynamically negligible at $z > 9$,
making the growth from the CMB era up to that epoch independent of dark energy. In an $\Omega_m = 1$
universe, $G_{\rm GR}(z) \propto (1 + z)^{-1}$, so the plotted ratio falls below unity when $\Omega_m(z)$ starts to fall below
one. For $\Omega_k = 0.01$, $\Omega_m(z)$ is below that in our fiducial model (see eqs. [3] and [5]) both because
of the $\Omega_k$ term in the Friedmann equation and because we lower $\Omega_m(z = 0)$ from 0.27 to 0.22 to
keep $D_*$ fixed, thus depressing $G_{\rm GR}(z)$ increasingly towards lower $z$. For $w = -0.9$, however, the
depression of $\Omega_m(z)/\Omega_m(z = 0)$ from the Friedmann equation is countered by the higher value of
$\Omega_m(z = 0) = 0.30$ adopted to fix $D_*$, so the depression of $G_{\rm GR}(z)$ is smaller, and it actually recovers
towards the fiducial value as $z$ approaches zero. The effects on the growth rate $f(z)$ (right panel)
are similar but stronger, with our adopted parameter changes producing larger deviations from the
fiducial model and the influence of $w$ actually reversing sign at $z < 0.5$.

Figure 4. Evolution of the matter fluctuation amplitude for the models shown in Figure 3,
characterized by the rms linear fluctuation in comoving spheres of radius $8\,h^{-1}$ Mpc (left) or 11 Mpc
(right). All models are normalized to the WMAP7 CMB fluctuation amplitude.

In practice, observations do not probe the growth factor itself but the amplitude of matter
clustering, and in this case we must also account for the changing relation between the CMB power
spectrum and the matter clustering normalization. The left panel of Figure 4 plots $\sigma_8(z) \times (1 + z)$,
where $\sigma_8(z)$ is the rms linear theory density contrast in a sphere of comoving radius $8\,h^{-1}$ Mpc
(eqs. [32] and [33]). The right panel instead plots $\sigma_{11,\rm abs}(z) \times (1 + z)$, where $\sigma_{11,\rm abs}$ refers to a sphere
of radius 11 Mpc (equivalent to $8\,h^{-1}$ Mpc for $h = 0.727$). At high redshift these curves go flat as $\Omega_m(z)$
approaches one and the growth rate approaches $G_{\rm GR}(z) \propto (1 + z)^{-1}$. In the CMB-matched models
considered here, the impact of $w$ or $\Omega_k$ changes is complex, since changing these parameters alters
the best-fit values of $\Omega_m$ and $h$ as well as changing the growth factor directly through equation (16).
The values of $\sigma_8(z)$ change by 4-5% at all $z$ for $1 + w = \pm 0.1$, but these changes mostly track the
changes in $h$. In absolute units, the changes to $\sigma_{11,\rm abs}(z)$ are $< 1\%$, tracking (by definition) the
changes in $G_{\rm GR}(z)$ shown in Figure 3. For $\Omega_k = \pm 0.01$, $\sigma_8(z)$ changes by 4-5% at high $z$ but
converges nearly to the fiducial value at $z = 0$, while $\sigma_{11,\rm abs}(z)$ shows only 1% differences at high
$z$ but diverges at low $z$.

All of these models have the WMAP7 (Larson et al. 2011) normalization of the power spectrum
of inflationary fluctuations, $A_s = 2.43 \times 10^{-9}$ at comoving scale $k = 0.002\,{\rm Mpc}^{-1}$ at $z = z_* = 1091$.
The primary uncertainty in this normalization is the degeneracy with the electron optical depth $\tau$,
since late-time scattering suppresses the amplitude of the primary CMB anisotropies by a factor
$e^{-\tau}$ on the scales that determine the normalization. The WMAP7 constraints are $\tau = 0.088 \pm 0.015$
($1\sigma$), so the associated uncertainty in the matter fluctuation amplitude is 1.5%. (Recall that the
power spectrum amplitude is $\propto \sigma_8^2$, so its fractional error is a factor of two larger.) For Planck,
Holder et al. (2003) estimate uncertainty $\sigma_\tau = 0.01$ allowing for complex reionization history, and
we use this value in our own forecasts. While there have been some changes in the situation since
then (the polarized foregrounds at large scales are worse than anticipated, and $\tau$ is lower than the
central value from the first-year WMAP results), this expectation seems broadly consistent with
more recent studies (e.g., Mortonson and Hu 2008; Colombo and Pierpaoli 2009). This will likely
be the limiting factor for comparison of high-redshift (CMB) measurements with low-redshift (e.g.,
WL) measurements of the growth of structure (as opposed to measurements of evolution within
the observed low-z range), unless other probes of reionization such as 21 cm provide constraints on
the reionization history.

Following Albrecht et al. (2009), we parameterize departures from the GR growth rate by a
change $\Delta\gamma$ of the growth index (eq. [15]) and by an overall amplitude shift $G_9$ that is the ratio of
the matter fluctuation amplitude at $z = 9$ to the value that would be predicted by GR given the
same cosmological parameters and $w(z)$ history. Some caution is required in defining $\Delta\gamma$, since
the approximate growth-index equations are not exact, and their inaccuracies should not be defined
as failures of GR! For precise calculations, therefore, we adopt the Albrecht et al. (2009) expressions
for growth factor evolution:

$$f(z) = f_{\rm GR}(z)\,\left[1 + \Delta\gamma\,\ln\Omega_m(z)\right], \tag{45}$$

$$G(z) = G_9 \times G_{\rm GR}(z) \times \exp\left[\Delta\gamma \int_z^{9} \frac{dz'}{1 + z'}\, f_{\rm GR}(z')\,\ln\Omega_m(z')\right], \tag{46}$$

where $G_{\rm GR}(z)$ and $f_{\rm GR}(z)$ follow the (exact) solution to equation (14).
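The modification factor in equation (46) can be evaluated with a simple quadrature. The sketch below is illustrative only: it assumes flat $\Lambda$CDM for $\Omega_m(z)$, substitutes the common approximation $f_{\rm GR} \approx [\Omega_m(z)]^{0.55}$ for the exact solution that the text prescribes, and takes the integral to run from $z$ up to $z = 9$ so that $G = G_9\,G_{\rm GR}$ at $z = 9$.

```python
import numpy as np

def omega_m_z(z, om_m0):
    """Omega_m(z) for flat LambdaCDM (an assumption of this sketch)."""
    return om_m0 * (1 + z) ** 3 / (om_m0 * (1 + z) ** 3 + 1.0 - om_m0)

def f_gr(z, om_m0, gamma=0.55):
    """Approximate GR growth rate; the text prefers the exact ODE solution."""
    return omega_m_z(z, om_m0) ** gamma

def integral_to_top(y, x):
    """I(x_i) = integral of y from x_i to x[-1], on an increasing grid x."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate([np.cumsum(steps[::-1])[::-1], [0.0]])

def growth_ratio(z_grid, om_m0, delta_gamma, G9=1.0):
    """G(z) / G_GR(z) from eq. (46), with z_grid running from 0 up to 9."""
    integrand = (f_gr(z_grid, om_m0) * np.log(omega_m_z(z_grid, om_m0))
                 / (1 + z_grid))
    return G9 * np.exp(delta_gamma * integral_to_top(integrand, z_grid))

z = np.linspace(0.0, 9.0, 9001)
ratio = growth_ratio(z, om_m0=0.267, delta_gamma=0.05)
print(ratio[0])  # suppression of G(z = 0) relative to GR for this delta_gamma
```

Because $\ln\Omega_m(z) < 0$ once dark energy matters, a positive $\Delta\gamma$ lowers the growth rate and suppresses $G(z)$ at low redshift; differentiating the result with respect to $\ln a$ recovers equation (45).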

For practical purposes, one can use our definition of growth parameters to calculate the
normalized linear theory matter power spectrum at redshift $z$, given an assumed set of cosmological
parameter values and a $w(z)$ history, as follows. First, use CAMB (Lewis et al. 2000) or some
similar program to compute the normalized linear matter power spectrum at $z = 9$. Then multiply the
power spectrum by $G^2(z)/G^2_{\rm GR}(z = 9)$, with $G(z)$ given by equation (46) and $G_{\rm GR}(z)/G_{\rm GR}(z = 9)$
given by the exact solution to equation (14), or by the approximate integral solution (15),
computing $H(z)$ and $\Omega_m(z)$ from equations (3) and (5) given the cosmological parameters and $w(z)$. For
reference, we note that CAMB normalization with WMAP7 data yields, for a flat $\Lambda$CDM model,



as[z 



9) X (1 + 9) 
9) X (1 + 9) 



1.118 



1.134 



A, 



2.43 X 10-9 
As 

2.43 X 10-9 



1/2 



1/2 



„2(n^-l) 



„2(n^-l) 







0.023 J 


V 0.13 






0.023 ; 


V 0.13 



0.57 



h 



0.71 



0.57 



where the primordial amplitude $A_s$ is defined at comoving wavenumber $k = 0.002\,{\rm Mpc}^{-1}$. This
formula, similar to that in Hu and Jain (2004), is found by varying the parameters in CAMB
calculations around the WMAP7 mean values one at a time to evaluate logarithmic derivatives;
spot checks indicate that it is accurate to 0.2% over the $2\sigma$ range of the WMAP7 errors, and for
the range of $w$ and $\Omega_k$ variations in Table 1. For other models, one can use this formula to get
$\sigma_8(z = 9)$ in GR, assuming that the effect of dark energy at $z > 9$ is negligible, then multiply by
$G(z)/G_{\rm GR}(z = 9)$ to get $\sigma_8(z)$.

Footnote: For example, Colombo and Pierpaoli (2009) find $\sigma_\tau \approx 0.006$, albeit under somewhat
optimistic assumptions regarding foregrounds and sky cuts.

Footnote: Albrecht et al. (2009) denote this quantity $G_0$ instead of $G_9$, but we have reserved
subscript-0 to refer to $z = 0$ quantities.

For an analytic power spectrum, one can use the approximate formula in equation (25) of
Eisenstein and Hu (1999), which includes suppression of small scale power by baryonic effects but
does not incorporate BAO. This paper defines the power spectrum normalization in terms of a
parameter $\delta_H$, related to our growth factor and normalization $A_s$ by



nr. 



Using the appropriate values for our fiducial WMAP7 flat $\Lambda$CDM model [$G(z = 0) = 0.76$, $\Omega_m =
0.267$, $A_s = 2.43 \times 10^{-9}$ at $k_{\rm norm} = 0.002\,{\rm Mpc}^{-1}$, $h = 0.71$, $n_s = 0.963$] with this definition of $\delta_H$,
the Eisenstein and Hu (1999) formula agrees with the result from CAMB to 2% or better except
at the BAO scales, where deviations are up to 10%. One can also use this normalization for the
more complex (but still analytic) formulae of Eisenstein and Hu (1998), which do include BAO.
We caution that other papers and books (e.g., Dodelson 2003) have different definitions of $\delta_H$.



There are, of course, degeneracies between the modified gravity parameters $G_9$ and $\Delta\gamma$ and the
$w(z)$ history, since both affect structure growth. However, if $w(z)$ is pinned down well by $D(z)$
and $H(z)$ measurements, then measurements of matter clustering can be used to constrain $G_9$ and
$\Delta\gamma$. The clustering amplitude at a single redshift yields a degenerate combination of these two
parameters, but measurements at multiple redshifts or direct measurements of the growth rate via
redshift-space distortions can separate them in principle. Of course, there is no guarantee that a
modified gravity prediction can be adequately described by $G_9$ and a constant $\Delta\gamma$, and one might
more generally consider, for example, a functional history $\gamma(z)$ analogous to $w(z)$, or
a direct multiplicative change to the growth rate $d\ln G/d\ln a$ rather than a change of the growth
index $\gamma$. However, any constraints inconsistent with $G_9 = 1$, $\Delta\gamma = 0$ after marginalizing over $w(z)$
and cosmological parameters would be suggestive evidence for a breakdown of GR. Even if the
measurements themselves are convincing, one must be cautious in the interpretation, since apparent
discrepancies could arise from $w(z)$ histories outside the families considered in marginalization or
from other violations of the underlying assumptions. To give two examples, "early dark energy"
that is dynamically significant at high redshift could cause an apparent $G_9 < 1$, and decay of dark
matter into dark energy could cause an apparent $\Delta\gamma < 0$, since the value of $\Omega_m(z)/\Omega_m(z = 0)$ would
be higher than in the standard picture. In §7.7 we discuss other potential signatures of modified
gravity, such as scale-dependent growth, discrepancy between masses inferred from lensing and
from non-relativistic tracers, and different accelerations in low and high density environments, and
we mention other parameterizations that have been used to describe modified gravity models.



2.5. Overview of Methods 

We conclude our "background" material with a short overview of the methods we will describe 
in detail over the next four sections. 

Observations show that Type Ia supernovae have a peak luminosity that is tightly correlated
with the shape of their light curves: supernovae that rise and fall more slowly have higher peak
luminosity. The intrinsic dispersion around this relation is only about 0.12 mag, allowing each well
observed supernova to provide an estimated distance with a $1\sigma$ uncertainty of about 6%. Surveys
that detect tens or hundreds of Type Ia supernovae and measure their light curves and redshifts can
therefore measure the distance-redshift relation $D(z)$ with high precision. Because the supernova
luminosity is calibrated mainly by local observations of systems whose distances are inferred from
their redshifts, supernova surveys effectively measure $D(z)$ in units of $h^{-1}$ Mpc, not in absolute
units independent of $H_0$.
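The conversion from a 0.12 mag dispersion to a roughly 6% distance error follows from the definition of the distance modulus, $\mu = 5\log_{10} D + {\rm const}$, which gives $\sigma_D/D = (\ln 10/5)\,\sigma_m$. The short sketch below evaluates this and the purely statistical $1/\sqrt{N}$ improvement from averaging $N$ independent supernovae in a redshift bin; it ignores systematic uncertainties, and the choice of $N = 100$ is only an example.

```python
import math

def distance_fraction_error(sigma_mag):
    """Fractional distance error implied by a magnitude error.

    The distance modulus is mu = 5 log10(D) + const, so
    sigma_D / D = (ln 10 / 5) * sigma_mag.
    """
    return math.log(10.0) / 5.0 * sigma_mag

def binned_distance_error(sigma_mag, n_sn):
    """Statistical error on the mean distance from n_sn independent supernovae."""
    return distance_fraction_error(sigma_mag) / math.sqrt(n_sn)

print(distance_fraction_error(0.12))     # single supernova
print(binned_distance_error(0.12, 100))  # 100 supernovae in one redshift bin
```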

Baryon acoustic oscillations provide an entirely independent way of measuring cosmic distance.
Sound waves propagating before recombination imprint a characteristic scale on matter clustering,
which appears as a local enhancement in the correlation function at $r \sim 150$ Mpc. Imaging surveys
can detect this feature in the angular clustering of galaxies in bins of photometric redshift, yielding
the angular diameter distance $D_A(z_{\rm phot})$. A spectroscopic survey over the same volume resolves the
BAO feature in the line-of-sight direction and thereby yields a more precise $D_A(z)$ measurement.
Furthermore, measuring the BAO scale in velocity separation allows a direct determination of $H(z)$.
Other tracers of the matter distribution can also be used to measure BAO. Because the BAO scale
is known in absolute units (based on straightforward physical calculation and parameter values well
measured from the CMB), the BAO method measures $D(z)$ in absolute units (Mpc, not $h^{-1}$ Mpc),
so BAO and supernova measurements to the same redshift carry different information.

The shapes of distant galaxies are distorted by the weak gravitational lensing from matter
fluctuations along the line of sight. The typical distortion is only $\sim 0.5\%$, much smaller than the
$\sim 30\%$ dispersion of intrinsic galaxy ellipticities, but by measuring the correlation of ellipticities as a
function of angular separation, averaged over many galaxy pairs, one can infer the power spectrum
of the matter fluctuations producing the lensing. Alternatively, one can measure the average
elongation of background, lensed galaxies as a function of projected separation from foreground
lensing galaxies to infer the galaxy-mass correlation function of the foreground sample, which can
be combined with measurements of galaxy clustering to infer the matter clustering. By measuring
the projected matter power spectrum for background galaxy samples at different $z$, weak lensing
can constrain the growth function $G(z)$. However, the strength of lensing also depends on distances
to the sources and lenses, so in practice the weak lensing method constrains combinations of $G(z)$
and $D(z)$.

Clusters of galaxies trace the high end of the halo mass function, typically M > 10^14 M_⊙.
Observationally, one measures the number of clusters as a function of a mass proxy, which directly
constrains dn/(d ln M dV_c), where dn/d ln M is the halo mass function (eq. 38) and dV_c is the
comoving volume element at the redshift of interest (eq. 12). The mass function at high M is
sensitive to the amplitude of matter fluctuations, and therefore to G(z), though this information is
mixed with that in the cosmology dependence of the volume element dV_c ∝ D_A^2 H^-1. Clusters can
be identified in optical/near-IR surveys that find peaks in the galaxy distribution and measure their 
richness, in wide-area X-ray surveys that find extended sources and measure their X-ray luminosity 
and temperature, or in Sunyaev-Zel'dovich (SZ) surveys that find localized CMB decrements and 
measure their depth. The critical step in any cluster cosmology investigation is calibrating the 
relation between halo mass and the survey's cluster observable — richness, luminosity, temperature, 
SZ decrement — so that the mass function can be inferred from (or constrained by) the distribution 
of observables. We will argue in §6 that the most reliable route to such calibration is via weak
lensing, making wide-area optical or near-IR imaging a necessary component of any high-precision 
cosmic acceleration studies with clusters. 
3. Type Ia Supernovae



3.1. General Principles 

Supernovae (which we will often abbreviate to SN or SNe) are the most straightforward tool
for studying cosmic acceleration, and they are the tool that directly discovered acceleration in the
first place (Riess et al., 1998; Perlmutter et al., 1999; both using local calibration samples from the
Calan/Tololo survey, Hamuy et al., 1996). Type Ia supernovae, defined observationally by the
absence of hydrogen and presence of Si II in their early-time spectra (Filippenko, 1997), are thought to
arise from thermonuclear explosions of white dwarfs, though the evolutionary sequence or sequences
that lead to these explosions remain poorly understood. The two broad classes of progenitor models
are "single degenerate," in which a white dwarf accreting from a binary companion is pushed over
the Chandrasekhar mass limit, and "double degenerate," in which gravitational radiation causes
an orbiting pair of white dwarfs to merge and exceed the Chandrasekhar mass. The observed
supernova population could have contributions from both channels (see Livio, 1999, for a review of
Type Ia SN mechanisms).

To a rough approximation, Type Ia SNe are standard candles, with rms dispersion of approximately
0.4 magnitudes in V-band at peak luminosity (Hamuy et al., 1996; Riess et al., 1996). This
0.4-mag scatter can be sharply reduced using an empirical correlation between peak luminosity and
light curve shape (LCS): supernovae with higher peak luminosities decline more slowly after the
peak. This correlation, which we will refer to generically as the luminosity-LCS relation, was first
quantified by Phillips (1993) based on a handful of objects including the archetypes of low and
high luminosity Ia supernovae, SN 1991bg and SN 1991T, respectively. Also important to the
refinement of distance determinations was the development of corrections for the correlation between
SN color and extinction (Riess et al., 1996; Tripp, 1998; Phillips et al., 1999) and K-corrections for
redshifting effects (Kim et al., 1996; Nugent et al., 2002). These were all quickly incorporated into
analysis methods such as the Multicolor Light Curve Shape (MLCS; Riess et al., 1996) technique
used by the High-z Supernova Search (Schmidt et al., 1998) and the stretch-factor formalism used
by the Supernova Cosmology Project (Perlmutter et al., 1997).

With these corrections, the dispersion in well measured optical band peak magnitudes is only ~
0.12 magnitudes (Hicken et al., 2009b; Folatelli et al., 2010), allowing each well measured supernova
to provide a luminosity-distance estimate with ~ 6% uncertainty. The diversity of SN Ia light
curves is not fully understood, and peculiar SNe Ia appear to produce ~ 5% non-Gaussian tails
in the SN Ia distribution (Li et al., 2011). For the bulk of the population, the prevailing picture
is that the progenitor explosions produce varying amounts of 56Ni, whose radioactivity powers the
optical luminosity, and that the correlation of peak luminosity with light curve shape arises from
radiative transfer effects (Hoeflich et al., 1996; Kasen and Woosley, 2007). Recent studies suggest
that SN Ia are truly standard candles in the near-IR, with peak luminosities at rest-frame H-band
(1.6 μm) that have only ~ 0.1 magnitude rms dispersion independent of light curve shape, and with
little sensitivity to uncertain reddening laws (Mandel et al., 2009; Mandel et al., 2011). This small
dispersion in near-IR peak luminosities relative to optical is consistent with theoretical expectations
from radiative transfer models (Kasen, 2006).

To measure cosmic expansion with Type Ia SNe, one compares the corrected peak apparent
magnitudes of distant supernovae to those of local calibrators at 0.03 < z < 0.1, a "sweet spot"
in which distances inferred from redshifts are insensitive to peculiar velocities and to the assumed
densities of dark matter and dark energy. Since the distances to the local calibrators are usually
determined from Hubble expansion, this method gives the luminosity distance in units of
h^-1 Mpc. More generally, the SN method yields relative distances in different redshift bins, even if
one of those bins is not strictly local. The D_L(z) relation is sensitive to dark energy through
equation (7) and to space curvature through equations (10) and (11). A measurement
of N supernovae in a redshift bin with rms observational errors σ_obs in peak magnitudes yields an
estimate of D_L(z) with fractional statistical error

$$ \sigma_{\ln D_L} = \frac{(\sigma_{\rm int}^2 + \sigma_{\rm obs}^2)^{1/2}}{2 \times 1.086 \times \sqrt{N}} \tag{50} $$

where σ_int is the rms intrinsic scatter, the factor 1.086 converts from magnitudes to natural
logarithms, and the factor of two converts from flux uncertainty to distance uncertainty. As discussed
in §3.4 below, there are many possible sources of systematic uncertainty, including flux
calibration, corrections for dust extinction, and possible redshift evolution of the supernova population.
Of these, dust extinction looks like it may ultimately be the most difficult to control at the
sub-percent level, since even a 0.01-mag E(B-V) color excess corresponds to a 3% suppression of
V-band flux. This consideration provides strong motivation for focusing Stage IV supernova
surveys on rest-frame near-IR photometry, where dust extinction is a factor of 3 to 8 times smaller
compared to the optical and where the small scatter in peak luminosities may help minimize any
evolutionary effects.
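
As a rough illustration of equation (50), the following Python sketch evaluates the fractional
distance error for a redshift bin and the number of SNe needed to reach a target error on the mean
magnitude. The function names are hypothetical, and setting σ_obs = 0 in the example is an
assumption made only to isolate the effect of intrinsic scatter.

```python
import math

# 2.5 / ln(10) ~ 1.086 converts an error in ln(flux) to an error in magnitudes.
MAG_PER_LN_FLUX = 2.5 / math.log(10.0)

def frac_distance_error(sigma_int, sigma_obs, n_sne):
    """Fractional statistical error on D_L following equation (50)."""
    sigma_mag = math.hypot(sigma_int, sigma_obs)  # quadrature sum, in mag
    return sigma_mag / (2.0 * MAG_PER_LN_FLUX * math.sqrt(n_sne))

def sne_needed(sigma_int, sigma_obs, target_mag_error):
    """Approximate number of SNe for the error on the mean magnitude to hit the target."""
    return (sigma_int ** 2 + sigma_obs ** 2) / target_mag_error ** 2

# Illustrative inputs: 0.12 mag scatter per SN, observational errors neglected.
single = frac_distance_error(0.12, 0.0, 1)
print(f"one SN: {100.0 * single:.1f}% distance error")
print(f"SNe per bin for 0.01 mag: {sne_needed(0.12, 0.0, 0.01):.0f}")
```

With a scatter of 0.12 mag per object, the single-supernova value is close to the ~ 6% quoted above,
and the 1/sqrt(N) scaling shows why samples of order a hundred or more SNe per bin are needed before
systematic errors at the 0.01 mag level become the limiting factor.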

3.2. The Current State of Play

Building on the initial discovery of cosmic acceleration, supernova surveys have been a major
area of activity in observational cosmology over the last decade. The largest high-redshift
(z ≈ 0.4-1.0) data sets are those from the ESSENCE survey (Wood-Vasey et al., 2007; Narayan
et al., in prep.; ~200 spectroscopically confirmed Type Ia SNe) and the CFHT Supernova Legacy
Survey (SNLS; Astier et al., 2006; Conley et al., 2011; Sullivan et al., 2011; ~500 spectroscopically
confirmed Type Ia SNe in the three-year data set SNLS3). At very high redshifts, HST surveys
(Riess et al., 2004, 2007; Suzuki et al., 2011) have yielded ~ 25 Type Ia SNe at z > 1.0, which
confirm the expectation that the universe was decelerating at high redshift and limit possible systematic
effects from evolution of the supernova population or intergalactic dust extinction. At intermediate
redshifts (0.1 < z < 0.4), the SDSS-II supernova survey (Frieman et al., 2008b; Sako et al., 2008)
has discovered and monitored 500 spectroscopically confirmed Type Ia SNe, though only the
first-year data set (103 SNe) has so far been subjected to a full cosmological analysis. Finally, the last
five years have also seen major efforts to expand the sample of local calibrators and improve their
measurements, including rest-frame IR and rest-frame UV photometry (Wood-Vasey et al., 2008;
Stritzinger et al., 2011; Contreras et al., 2010; Hicken et al., 2009a).



The greatest cosmological utility from SNe la generally comes from the joint use of numerous 
samples that span a wide range in redshift. To limit systematic errors introduced by combining 
disparate SN surveys, it is often valuable to recompile a sample from these surveys as homogeneously 
as possible. This involves applying consistent criteria for inclusion in the sample, light curve 
fitting with a single algorithm, propagation of errors via covariance matrices, consistent use of 
K-corrections, and so forth. While any such "survey of surveys" is not unique and may not be
optimal for a specific application, these compilations are popular because of their ease of use. Recent
examples include the "Gold" sample (Riess et al., 2004, 2007), the "Union" and "Union2" samples
(Kowalski et al., 2008; Amanullah et al., 2010), the "Constitution" sample (Hicken et al., 2009a),
and the compilation of local, SDSS-II, SNLS3, and HST supernovae analyzed by Conley et al.
(2011).



Figure 5 Luminosity distance vs. redshift for our fiducial cosmological model (solid curves),
superposed on supernova measurements from the Union2 compilation (Amanullah et al., 2010). The
lower panel shows residuals from the fiducial model prediction for the SN data, with open circles
marking medians of the data in Δz = 0.2 bins and broken curves showing the CMB-normalized
variant models described in Table 1. Note that these distances are in Gpc units.

Figure 6 Constraints from WMAP7 CMB data, Union2 SN data, and the combination of the two,
in (a) the (Ω_m, Ω_Λ) plane assuming w = -1, (b) the (Ω_m, w) plane assuming Ω_k = 0, and (c) the
(w_0.5, w_a) plane assuming Ω_k = 0, where w_0.5 is the value of w at z = 0.5. Contours show 68%
confidence intervals. In contrast to panels (a) and (b), the combined contour in (c) is tighter than
one would guess from the overlap of the individual contours because the combined data set breaks
degeneracies among other parameters that are marginalized over when inferring w_0.5 and w_a.

Figure 5 plots luminosity distance measurements from the Union2 compilation over the model
predictions shown previously in Figure 2 (multiplied by 1 + z to convert comoving angular diameter
distance to luminosity distance). The data are in good agreement with the fiducial cosmological
model, and the parameter changes in the bottom panel (Ω_k = ±0.01, 1+w = ±0.1) are at the border
of detectability. (Recall that other parameters are adjusted to reproduce the CMB anisotropy of
the fiducial model; see Table 1.)
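
The conversion from comoving distance to luminosity distance mentioned above can be sketched
numerically. The Python example below is only an illustration under stated assumptions: spatial
flatness, constant w, negligible radiation, and illustrative parameter values (Ω_m = 0.27 and an
assumed H_0 = 70 km/s/Mpc). The function names are hypothetical, and the sketch does not reproduce
the CMB-normalized models of the figure.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble_ratio(z, omega_m, w):
    """H(z)/H0 for a flat universe with constant w; radiation is neglected."""
    omega_de = 1.0 - omega_m  # flatness assumed
    return math.sqrt(omega_m * (1.0 + z) ** 3
                     + omega_de * (1.0 + z) ** (3.0 * (1.0 + w)))

def comoving_distance(z, omega_m, w, h0, steps=2000):
    """Comoving distance in Mpc from Simpson's rule on c / H(z); steps must be even."""
    dz = z / steps
    total = 1.0 / hubble_ratio(0.0, omega_m, w) + 1.0 / hubble_ratio(z, omega_m, w)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_ratio(i * dz, omega_m, w)
    return (C_KM_S / h0) * total * dz / 3.0

def luminosity_distance(z, omega_m=0.27, w=-1.0, h0=70.0):
    """D_L = (1 + z) times the comoving distance, valid for zero curvature."""
    return (1.0 + z) * comoving_distance(z, omega_m, w, h0)

def distance_modulus(d_l_mpc):
    """Distance modulus mu = 5 log10(D_L / 10 pc) for D_L in Mpc."""
    return 5.0 * math.log10(d_l_mpc) + 25.0

for w in (-1.0, -0.9):
    d_l = luminosity_distance(0.5, w=w)
    print(f"w = {w:+.1f}: D_L(z=0.5) = {d_l:.0f} Mpc, mu = {distance_modulus(d_l):.3f}")
```

Comparing the two printed distance moduli gives a feel for how small the magnitude shift from a
change of 0.1 in w is, which is why the parameter changes in the residual panel sit near the limit
of detectability.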

Figure 6 illustrates model constraints from the Union2 supernova data and WMAP7 CMB
data, which we have computed using CosmoMC (Lewis and Bridle, 2002). We use the Union2
covariance matrix that includes correlated systematic error contributions. Panel (a) shows the
(Ω_m, Ω_Λ) plane assuming w = -1. CMB and SN constraints are highly complementary in this
plane because the former are most sensitive to the total energy density (Ω_m + Ω_Λ) and the latter to
the difference between the densities of "attractive" matter and "repulsive" dark energy. Together
the two data sets yield tight constraints in this space, Ω_m = 0.282 ± 0.037, Ω_Λ = 0.723 ± 0.030,
consistent with a flat universe. Panel (b) shows the (Ω_m, w) plane, where we now assume spatial
flatness and a constant value of w. Here again the SN and CMB data are highly complementary,
yielding a tight combined constraint Ω_m = 0.270 ± 0.023, w = -1.007 ± 0.081, consistent with
a cosmological constant. Panel (c) shows the (w_0.5, w_a) plane, where we have adopted the
2-parameter dark energy model defined earlier; w_0.5 is the value of w at z = 0.5, which is much
better determined than the value of w_0 and only weakly correlated with w_a. Here we have assumed
spatial flatness and marginalized over uncertainty in Ω_m. CMB and SN data provide only weak
constraints individually in this model space, but the combination still provides a good constraint
on w_0.5, with the error on w_0.5 = -1.008 ± 0.132 only degraded by ~ 50% compared to panel (b).
Constraints on w_a, on the other hand, are very weak. The w and w_0.5 constraints in panels (b) and
(c) would degrade substantially if we allowed non-zero Ω_k; with this level of flexibility, one must
bring in additional data to get useful constraints. However, an H_0 or BAO constraint at the level of
current measurements is sufficient to remove most of the sensitivity to Ω_k (Mortonson et al., 2010).



Plots and constraints similar to Figures 5 and 6 appear in many of the papers cited above. The
most up-to-date analysis is that of Conley et al. (2011), who find
w = -0.91^{+0.16}_{-0.20} (stat) ^{+0.07}_{-0.14} (sys)
for SNe alone, assuming a flat universe with constant w and marginalizing over Ω_m. Combining
this measurement with other data sets, Sullivan et al. (2011) find w ≈ -1.016 in combination
with 7-year WMAP CMB constraints (similar to the value and error bar quoted above), and
w = -1.061^{+0.069}_{-0.068} after adding BAO and H_0 measurements.



There are several indications that current SN cosmology studies are limited by systematic
uncertainties associated with the linked issues of dust extinction, SN colors, and photometric calibration.
In any cosmological analysis, one uses the color of a supernova relative to a template expectation
(derived from a training set) to infer, and correct for, a correlation between color and apparent
magnitude arising from dust and/or intrinsic color variations. In the analysis of Wood-Vasey et al.
(2007), different priors about host galaxy extinction change the inferred value of w by amounts
comparable to the statistical error. When the ratio of extinction to reddening is treated as a free
parameter in the cosmological fits, the derived values are typically quite far from those measured for
Galactic interstellar dust, e.g., R_V = A_V/E(B-V) = 1.5-2.5 (Hicken et al., 2009b; Kessler et al.,
2009; Sullivan et al., 2011) instead of the mean R_V = 3.1 found in the diffuse interstellar medium
of the Milky Way (Cardelli et al., 1989). This difference could be a reflection of different kinds of
dust along the line of sight to the supernova (e.g., circumstellar dust), but it could also arise from
intrinsic color differences among SNe Ia with similar light curve shapes, which would reduce the
inferred R_V if they are assumed to arise from reddening. Supporting the latter idea, the distribution
of SN colors shows little dependence on host galaxy properties (Kessler et al., 2009; Sullivan et al.,
2010), while such dependence might be expected if the color distribution is strongly affected by
dust. Chotard et al. (2011), using spectroscopic indicators of luminosity in nearby SNe, infer an
extinction law with R_V = 2.8 ± 0.3, consistent with the Galactic value.
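
To make the role of R_V concrete, the short Python sketch below converts a color excess E(B-V)
into a V-band extinction A_V = R_V × E(B-V) and the corresponding fractional loss of flux. The
function name is hypothetical, and the two R_V values are simply the Galactic mean and one value
from the range quoted above; this is an illustration rather than a fit to any data set.

```python
def v_band_flux_suppression(ebv, r_v=3.1):
    """Fractional loss of V-band flux for color excess E(B-V) and a given R_V."""
    a_v = r_v * ebv  # V-band extinction in magnitudes
    return 1.0 - 10.0 ** (-0.4 * a_v)

for r_v in (3.1, 2.0):
    loss = v_band_flux_suppression(0.01, r_v=r_v)
    print(f"R_V = {r_v}: E(B-V) = 0.01 mag suppresses V-band flux by {100.0 * loss:.1f}%")
```

For the Galactic R_V = 3.1 this gives roughly the 3% suppression per 0.01 mag of color excess cited
in §3.1, and it shows how directly an error in the assumed R_V propagates into the corrected
supernova brightness.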



One of the main surprises in the first-year analysis of the SDSS-II Supernova Survey (Kessler et al.,
2009) was the realization that the two main algorithms developed by other groups for global
fitting of SN light curves and cosmological parameters, MLCS2k2 (Jha et al., 2007) and SALT2
(Guy et al., 2007), initially gave statistically inconsistent cosmological results (w = -0.76 ± 0.07
vs. w = -0.96 ± 0.06, quoting only the statistical errors) when applied to the same data sets,
a discrepancy that persisted even if the SDSS-II data themselves were omitted from the fits.
Kessler et al. (2009) traced this discrepancy to two factors, one related to calibration data and
the other to the treatment of SN colors. For the calibration data, ultraviolet flux measurements in
the local sample from the U-band appear inconsistent with those from the g-band at only
moderate redshift and suggest a problem with the (observed frame) U-band calibration. This problem
translates into a difference between fitters because one is trained with U-band data and the other
is not. A more subtle difference arises from the determination of the correction to SN brightness 
from color measurements, specifically whether the correlation can be assumed to be independent 
of redshift and survey and whether changes in color are due solely to extinction. While these sys- 
tematic uncertainties will certainly be reduced by larger multi-wavelength data sets and improved 
analysis methods, the experience from these recent studies argues strongly for using rest-frame IR 
photometry in precision cosmological studies to circumvent uncertainties related to extinction. 

3.3. Observational Considerations 

There are several steps to a supernova cosmology campaign: discovery, monitoring, spectro- 
scopic confirmation, and calibration against low redshift samples. In large area surveys, discovery 
and monitoring are usually done together, through repeated imaging of a large field of view in 
multiple bands. A variety of image-differencing techniques can be used to identify SNe (distin- 
guished from other variable objects by their light curves) and measure their magnitudes vs. time. 
As a rule of thumb, a minimum rest-frame cadence of one observation per ~ 5 days is needed to
get adequate measurements of light curve shapes and normalizations, such that statistical errors 
are dominated by the intrinsic dispersion of SN luminosities and not by observational errors. The 
required cadence may be somewhat lower in the rest-frame IR, where the dependence on light 
curve shape is weaker, but one must still have enough data points to determine peak luminosity 
accurately. At least two bands are needed to measure SN colors and thereby infer dust extinction, 
though more are better, and multiple colors may prove critical to distinguishing different forms of 
extinction (interstellar, circumstellar, and intergalactic) from each other and from intrinsic color 
differences. 

Figure 7, based on Table 7 of Tonry et al. (2003), plots the peak apparent magnitude of a typical
Type Ia supernova vs. redshift in observed frame I and J band. As a rough rule of thumb, a survey
with periodic and uniform exposures targeting supernovae at a given redshift should measure to a
signal-to-noise ratio of ~ 15 at peak, so that it still usefully measures the SN before or after peak
when it is 1.5 magnitudes fainter. This depth ensures that incompleteness for supernovae below
the median luminosity does not bias the results and that photometric errors do not dominate over
intrinsic scatter in cosmological analysis. Ground-based surveys designed to observe SNe Ia to
z < 0.8 will typically find ~ 10 SNe Ia per square degree per month.
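
The signal-to-noise rule of thumb can be checked with a few lines of Python. This sketch assumes a
fixed exposure time and brackets the answer with two idealized noise regimes: background-limited,
where S/N scales linearly with source flux, and source-limited, where it scales as the square root
of the flux. The function name and the choice of these two limiting regimes are assumptions made
for illustration.

```python
def snr_when_fainter(snr_peak, delta_mag, background_limited=True):
    """S/N of a source delta_mag fainter than peak, at fixed exposure time."""
    flux_ratio = 10.0 ** (-0.4 * delta_mag)
    if background_limited:
        # Noise is set by the sky, so S/N is proportional to source flux.
        return snr_peak * flux_ratio
    # Noise is set by source photons, so S/N goes as sqrt(flux).
    return snr_peak * flux_ratio ** 0.5

for limited in (True, False):
    regime = "background-limited" if limited else "source-limited"
    snr = snr_when_fainter(15.0, 1.5, background_limited=limited)
    print(f"{regime}: S/N = {snr:.1f} at 1.5 mag below peak")
```

In the background-limited case, which is the relevant one for faint high-redshift SNe, a peak S/N
of ~ 15 leaves only a marginal detection 1.5 magnitudes below peak, consistent with the depth
requirement stated above.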

After discovering SNe, one must determine their type and redshift. The most reliable approach
is to obtain their spectra to cross-correlate their spectral features with known templates. Spectral
resolution R ~ 250 and S/N ~ 5 per resolution element are adequate for these purposes, but even
at this level spectroscopic follow-up is typically the most resource intensive step of a supernova
campaign. For the same telescope aperture, an epoch of spectroscopy requires an order of
magnitude more time than an epoch of photometry, and one generally loses the parallelism afforded by
photometric monitoring with a large camera (which has several SNe per field of view at a given
time). Spectroscopic follow-up of the SNLS3 sample, for example, used more than 1600 hours of
8-10m telescope time (M. Sullivan, private communication).

Conley et al. (2011) provide further evidence for an error in the local U-band calibration, and they omit these
data from their cosmological analysis.

Observed-frame time intervals are larger by 1 + z.

Figure 7 Peak apparent magnitude of a typical Type Ia supernova as a function of redshift in
observed-frame I-band (solid) or J-band (dotted), from Table 7 of Tonry et al. (2003). The z > 1.1
portion of the I-band curve and z < 0.4 portion of the J-band curve rely on extrapolation of the
template systems' spectral energy distributions beyond the observed range. Magnitudes are on the
Vega system.

In principle, photometric redshifts can be used in place of spectroscopic redshifts, and if they 
are accurate to a fractional distance error AD/D < 10% they lead to only moderate degradation in 
statistical accuracy. However, given the degeneracies among redshift, SN color, and dust extinction, 
and the increased ambition of SN surveys to control systematics, we are skeptical that cosmological 
SN surveys can achieve the desired accuracy using only broad-band photometric monitoring and 
spectroscopic follow-up of a small fraction of the sample. An intermediate approach that may work 
would be to measure the cross-correlation of a supernova SED with the SN Ia spectral features
using custom-designed optical filters that are matched to SN spectroscopic features at different
redshifts (Scolnic et al., 2009). It also may be possible to make use of subsamples of SNe found
in passive (non-star-forming) galaxies, which should host only Type Ia SNe and which allow more
accurate photometric redshifts from host galaxies. For type identification, one can also check for a
second peak in the rest frame infrared light curve, a morphological feature that is unique to SNe
Ia.

Another intermediate approach is to obtain eventual spectroscopic observations of all host
galaxies in the cosmological analysis sample but not attempt real-time spectroscopy of all candidate
Type Ia supernovae. This scheme still yields precise redshifts, and it provides host galaxy data
that can be used to measure and remove correlations between supernova and host galaxy properties
(see §3.4). While it still requires one faint-object spectrum per supernova, the scheduling demands
are much more flexible. One can also apply data quality and other selection cuts before the
spectroscopic observations to reduce the total number of spectra required, though one must be
careful not to let biases creep in at this stage. With good photometric monitoring and with
subsequent spectroscopic redshifts of apparent hosts, Kessler et al. (2010) find that they can identify
Type Ia SNe with 70% to 90% confidence from the LCS and color alone, and Bernstein et al. (2011)
forecast Type Ia purity as high as 98% for DES-like photometric observations. With this approach,
a moderate amount of real-time supernova spectroscopy might suffice to assess efficiency and biases.

3.4. Systematic Uncertainties and Strategies for Amelioration

The largest current supernova surveys have ~ 500 Type Ia supernovae. Future surveys hope to
discover and monitor thousands of supernovae, sufficient to yield statistical errors of 0.01 mag or
smaller in narrow redshift bins with Δz ~ 0.1-0.2. Realizing the statistical power of such surveys
will require eliminating or limiting several distinct sources of systematic error. These include flux
calibration errors across a wide range of flux and redshift, the systematics associated with SN
colors and dust extinction, the possible evolution of the supernova population with redshift, and
gravitational lensing. We discuss each of these issues in turn.

The leverage of SN studies comes from comparing SNe over a wide span of redshift and thus an 
enormous range of flux; for example, the typical peak I-band magnitude at z = 0.8 is 23 mag while
the median peak B-band magnitude of the local calibrator sample used in many analyses is 17 mag,
implying a ratio of 250 in flux. Maintaining sub-percent accuracy in relative flux calibration over 
such a range would be challenging under any circumstances, and for SN surveys it is complicated by 
the fact that (a) local and distant SNe are usually observed with different telescopes equipped with 
different filters, (b) a given observed-frame filter intercepts a different portion of the SN rest-frame 
spectral energy distribution (SED) at each redshift, and (c) supernova SEDs are very different 



from those of the standard stars used for flux calibration in most of astronomy. Conley et al.
(2011) identify calibration as the dominant systematic in SNLS3, the only systematic in their
analysis that makes a major contribution to their total error budget. Flux calibration uncertainties
can be reduced by carefully designing photometric SN surveys with specialized hardware (e.g.,
tunable lasers, NIST photodiodes and calibration sources; Stubbs and Tonry, 2006) to measure
the system throughput in situ and by choosing filter systems that provide a good match in
rest-frame SED sampling between low- and high-redshift samples. The ACCESS rocket program should
improve flux calibration with sub-orbital flights that compare NIST photodiodes to calibration
stars (Kaiser et al., 2010). "Self-calibration" that marginalizes over flux-calibration uncertainty
can further reduce this systematic error (Kim and Miquel, 2006), but at the price of increasing
statistical error.
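
The dynamic range quoted in this paragraph follows directly from the definition of the magnitude
scale. The Python sketch below reproduces the flux ratio for a 6 mag difference and converts a
fractional flux-calibration error into magnitudes; the function names are hypothetical and the 1%
error is an illustrative input.

```python
import math

def flux_ratio(mag_faint, mag_bright):
    """Flux of the brighter source divided by that of the fainter one."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))

def mag_error_from_flux_error(frac_flux_error):
    """Magnitude offset produced by a fractional error in relative flux calibration."""
    return 2.5 * math.log10(1.0 + frac_flux_error)

print(f"23 mag vs 17 mag: flux ratio = {flux_ratio(23.0, 17.0):.0f}")
print(f"1% flux error = {mag_error_from_flux_error(0.01):.4f} mag")
```

A 1% error in relative calibration therefore maps to about 0.01 mag, the same size as the
statistical error targeted per redshift bin, which is why sub-percent calibration across a factor
of ~ 250 in flux matters.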

As noted in §3.2, uncertainties in dust extinction, linked to uncertainties in intrinsic
SN colors and in photometric calibration, are already important systematics in SN studies of
cosmic acceleration. These uncertainties can likely be reduced with detailed, well calibrated,
multi-wavelength observations of large numbers of low redshift SNe, which can characterize the separate
dependence of SN colors on luminosity, light curve shape, and time since explosion, and provide
constraints on dust extinction laws that are isolated from cosmological inferences. The final analyses
of data from the SDSS-II supernova survey (Frieman et al., 2008b) and the low-redshift portion of
the Carnegie Supernova Project (Hamuy et al., 2006) should allow advances on this front. Analysis
techniques that eliminate the most highly reddened SNe can also reduce extinction systematics
if they can be applied in a way that does not introduce selection biases; as an extreme example,
one can employ only SNe in early-type galactic hosts, which have low amounts of interstellar
dust. Perhaps the most important strategy for reducing extinction systematics is to work as far as
possible to red/near-IR rest-frame wavelengths, where extinction is low compared to blue/visual
wavelengths. Most ground-based SN cosmology studies to date work at rest-frame B (0.4-0.5 μm)
or V (0.5-0.6 μm) wavelengths, which transform to observed-frame I-band (0.7-0.9 μm) at
z ≈ 0.5-0.8. The high-redshift portion of the Carnegie Supernova Project (Freedman et al.,
2009) produced a SN Hubble diagram to z ~ 0.7 in rest-frame I-band, where systematic errors
due to uncertainty in the reddening laws are roughly half that at V-band. Mandel et al. (2009)
find that the intrinsic dispersion of peak luminosities is only ~ 0.11 mag at rest-frame H-band
(1.5-1.7 μm), where systematics due to extinction are only ~ 1/6 that at V-band. However,
obtaining rest-frame near-IR photometry for high-redshift supernovae requires space observations
due to the high backgrounds seen from the ground (§3.5).

For detailed discussions of systematics in the context of specific contemporary data sets, see, e.g.,
Wood-Vasey et al. (2007), Kessler et al. (2010) and Conley et al. (2011).

Locally observed SNe span a wide range in the age, metallicity, and current star formation rate
(SFR) of their host stellar populations. This breadth of host conditions provides a laboratory for the
investigation of the evolution of SNe Ia as distance indicators. Recently such an effect was found and
calibrated in the form of a modest, 0.03 mag dex^-1 relationship between host galaxy stellar mass
(a likely tracer of metallicity) and calibrated SN Ia magnitude (Kelly et al., 2010; Lampeitl et al.,
2010; Sullivan et al., 2010; see Hicken et al. (2009b) for an analysis with host morphology). At the
level of precision enabled by current surveys, it is necessary to correct for this effect (Conley et al.,
2011), but the uncertainty in the correction is not a limiting systematic.

Constraining evolutionary effects to a tenth of σ_int (~ 0.01 mag) or better is a challenge. For
example, if there are two populations of Type Ia progenitors (e.g., single and double
degenerates) that have slightly shifted luminosity-LCS relations, then evolution in the population ratio
could produce evolution in the mean relation at a fraction of σ_int (see, e.g., Sarkar et al., 2008).
A strategy for limiting evolution systematics is to break the SN sample into subsets defined by
spectral features, light curve shapes, or host properties and check for consistency of cosmological
results, since evolution is unlikely to affect all populations in the same way. A complementary
path (Riess and Livio, 2006) is to observe supernovae at z > 2, where predicted fluxes relative to
low-redshift samples are generally insensitive to dark energy parameters; discrepancies would be
an indication of evolutionary effects or of unconventional dark energy models that could be tested
by other probes. Finally, we note that any evolutionary corrections may be weaker in the near-IR,
both because of the narrower range of luminosities and because of the weaker sensitivity to metal
lines (which may itself contribute to the narrower luminosity range) and reddening laws.

Gravitational lensing by intervening large scale structure introduces scatter in observed SN
fluxes, at a level of ~ 0.05 magnitudes for sources at z = 1 (e.g., Frieman, 1996; Wang, 1999).
Flux conservation guarantees that the mean flux of the SN population does not change. However,
some care is required to ensure that selection effects or weighting schemes do not bias results,
especially as the magnification distribution is highly non-Gaussian (see, e.g., Sarkar et al.).
Since lensing effects are small and calculable, they are unlikely to become a
limiting systematic even for the most ambitious future surveys.

If rest-frame near-IR photometry can be obtained for large supernova samples, we anticipate
that flux calibration uncertainties will ultimately set the floor on systematics. A detailed recent
investigation of the HST WFC3-IR system implies a limiting calibration uncertainty of ~ 0.02 mag
(Riess, 2012). A future mission designed with IR supernova photometry as a key goal could
presumably do better, so 0.005-0.02 mag seems a plausible bracket for calibration-limited systematics.

3.5. Space vs. Ground 

Space observations offer several key advantages for precision supernova cosmology, a point
emphasized early on by the SNAP (SuperNova Acceleration Probe) collaboration (e.g., Aldering et al.,
2002). The first is the sharp and stable point-spread function (PSF) achievable from space, which
greatly increases sensitivity to faint, variable point sources and the precision and accuracy of
point-source photometry, especially in the presence of a host galaxy background. Adaptive optics can
produce a sharp PSF from the ground, but it is not likely to deliver photometry with 1% precision 
and an image stable enough to allow host subtraction at random positions on the sky away from 
bright guide stars. The second advantage is the greater accuracy and precision of flux calibration 
achievable from space, with no time-variable atmospheric conditions and (for a well chosen orbit) 
minimal variations in the telescope environment. The third is the vastly lower sky background in the 
near-IR. Typical sky backgrounds for ground-based observations are 16, 14, and 13 mag arcsec$^{-2}$
at J, H, and K (Vega), while in space they are 6 to 8 mags fainter, limited by the zodiacal light. 

It is the last of these advantages that we regard as critical — no improvements in ground- 
based technology or observing strategy will ever remove the IR sky background. We have already 
emphasized the key role of rest-frame near-IR photometry in reducing systematics associated with 
dust extinction, and possibly with evolution. Obtaining rest-frame J-band (1.2 μm) photometry of
SNe at z = 0.8 requires imaging at λ = 2 μm. A 1.5-m space telescope (the aperture proposed
for WFIRST) can make a S/N=15 measurement at the peak magnitude of a median z = 0.8
supernova at this wavelength in about 20 minutes. A ground-based 4-m telescope with 0.8" seeing 
and a typical IR sky background would require multiple nights, and even then the accuracy of 
photometry would be compromised by variable sky background. 

A space-based near-IR telescope also offers the option of discovering and monitoring SNe at 
substantially higher redshifts, while working at shorter rest-frame wavelengths. However, for the 
reasons discussed quantitatively elsewhere in this review, we think that the most important role for a mission
like WFIRST in SN studies is to provide the highest achievable accuracy and precision at z < 0.8, 
as part of a combined dark energy program that also includes ambitious BAO and weak lensing 
surveys. At low redshifts, SNe can achieve a measurement precision unmatched by other methods, 
but at higher redshifts they cannot match the dark energy sensitivity of large BAO surveys unless 
they can push statistical and systematic errors well below 0.01 mag (see Table 6 in §8.2). Campaigns
to detect and monitor high-redshift (z > 0.8) SNe would be justified if they yield more leverage on
dark energy parameters than other applications of the same observing time (i.e., to weak lensing 
or BAO surveys). For a given observing allocation, the maximally efficient use of WFIRST SN 
time may be in a combined ground-space program, with ground-based photometry (in rest-frame 
optical) providing high-cadence light-curve sampling and color measurements and lower cadence 
space observations providing the critical, well calibrated, dust-insensitive photometry used for the 
SN distance determinations. 



3.6. Prospects

The next year or two should see the publication of final results from the SDSS-II supernova 
survey, the five-year SNLS sample, and ESSENCE. The measurements from these large surveys 
should substantially reduce the statistical errors in the SN Hubble diagram. Perhaps more impor- 
tantly, they should yield significant reductions of systematic errors because of their high sampling 
cadence, wide wavelength range, and greater attention to photometric calibration. Large campaigns 
to discover and monitor local supernovae (e.g., PTF, LOSS, CSP, SN Factory) should also yield
better understanding of potential systematics, as well as better local calibration. A new HST survey
by the Higher-z Team using WFC3 will find more high redshift (z > 1.5) SNe, which provide
additional leverage on the Hubble diagram and constraints on evolution. 

The largest new projects on the near horizon are the SN surveys of PS1 (now underway) and
DES (beginning observations in late 2012). Bernstein et al. (2011) discuss the DES strategy in some
detail and forecast discovery of up to 4000 Type Ia SNe out to redshift z = 1.2. For spectroscopic
follow-up, DES aims to observe ~ 10-20% of their high-z supernovae but obtain nearly complete
spectroscopic host galaxy redshifts for their cosmological sample. A similarly detailed description of
the PS1 strategy is not yet available, but in principle PS1 should also be able to discover thousands
of Type Ia SNe. In purely statistical terms, a sample of 2000 SNe out to z = 0.8 can achieve errors
of 0.007 mag in redshift bins of Δz = 0.2, so both PS1 and DES will almost certainly be limited
by systematic rather than statistical errors.
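
The purely statistical estimate above can be reproduced with a short calculation. The sketch below assumes equal numbers of SNe in each redshift bin and a per-object distance scatter of 0.15 mag; that scatter is an illustrative assumption, not a value quoted in this section, and the function name is hypothetical.

```python
import math


def binned_statistical_error(n_sne, z_max, bin_width, sigma_per_sn):
    """Statistical distance error (mag) per redshift bin, assuming equal counts per bin."""
    n_bins = round(z_max / bin_width)
    sne_per_bin = n_sne / n_bins
    # Independent errors average down as 1/sqrt(N).
    return sigma_per_sn / math.sqrt(sne_per_bin)


# ASSUMPTION: 0.15 mag scatter per supernova (illustrative, not from the text).
err = binned_statistical_error(n_sne=2000, z_max=0.8, bin_width=0.2, sigma_per_sn=0.15)
print(f"statistical error per bin: {err:.4f} mag")
```

Under these assumptions the error per bin comes out close to the 0.007 mag quoted above, and it scales as the inverse square root of the number of SNe per bin, which is why systematics rather than statistics are expected to dominate.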

Looking further ahead, LSST is expected to yield samples of tens or even hundreds of thousands
of SNe (LSST Science Collaboration, 2009). These photometric samples will certainly swamp spectroscopic
follow-up capabilities, and the LSST surveys will again be systematics limited, though the
enormous sample size (allowing cross-checks and focus on the most favorable subsamples) and the
high-cadence monitoring with high photometric precision across the optical spectrum should reduce
systematics below those of PS1 and DES. Finally, if WFIRST is completed and launched as per
the Astro2010 recommendations, the access to the rest-frame near-IR should yield an unmatchable 
advantage for SN cosmology and the best achievable results in SN dark energy studies.

4. Baryon Acoustic Oscillations



4.1. General Principles 

The baryon acoustic oscillation method relies on the imprint left by sound waves in the early 
universe to provide a feature of known size in the late-time clustering of matter and galaxies. 
By measuring this acoustic scale at a variety of redshifts, one can infer $D_A(z)$ and $H(z)$. The
acoustic length scale can be computed as the comoving distance that the sound waves could travel
from the Big Bang until recombination at $z = z_*$ (see descriptions by Hu and Sugiyama, 1996;
Eisenstein and Hu, 1998). This is a simple integral

$$ r_s = \int_0^{t_*} c_s(t)\,(1+z)\,dt = \int_{z_*}^{\infty} \frac{c_s(z)}{H(z)}\,dz . \qquad (51) $$

The behavior of $H(z)$ at $z > z_*$ depends on the ratio of the matter density to radiation density; in
simple cosmologies, the radiation sector (photons and neutrinos) is fixed and the ratio is proportional
to $\Omega_m h^2$. The sound speed depends on the ratio of radiation pressure to the energy density
of the baryon-photon fluid, determined by the baryon-to-photon ratio, which is proportional to
$\Omega_b h^2$. Both the matter-to-radiation ratio and the baryon-to-photon ratio are well measured by
the relative heights of the acoustic peaks in the CMB anisotropy power spectrum. Analyses of
WMAP data in the usual ΛCDM cosmological models give a 1.1% inference of the acoustic scale
(Jarosik et al., 2011); Planck is expected to shrink this error bar to 0.25%. Note that the acoustic
scale is determined in absolute units, Mpc not $h^{-1}$ Mpc.
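
As a rough numerical illustration of equation (51), the sketch below integrates $c_s/H$ using the standard sound speed of the baryon-photon fluid, $c_s = c/\sqrt{3(1+R)}$ with $R = 3\rho_b/4\rho_\gamma$, and an $H(z)$ that contains only matter and radiation. The parameter values ($\Omega_m h^2 = 0.1334$, $\Omega_b h^2 = 0.0226$, $z_* = 1090$, $\Omega_\gamma h^2 = 2.47 \times 10^{-5}$, $N_{\rm eff} = 3.04$) are WMAP7-like assumptions chosen for illustration, not numbers taken from this section, and the function name is hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def sound_horizon(omega_m_h2, omega_b_h2, z_star,
                  omega_gamma_h2=2.47e-5, n_eff=3.04, n_steps=20000):
    """Comoving sound horizon in Mpc: integral of c_s/H from z_star to infinity.

    With a = 1/(1+z) the integral becomes int_0^{a_star} c_s da / (a^2 H).
    Only matter and radiation are kept in H, adequate at z > z_star.
    n_steps must be even (Simpson's rule).
    """
    # ASSUMPTION: photons plus effectively massless neutrinos.
    omega_r_h2 = omega_gamma_h2 * (1.0 + 0.2271 * n_eff)
    a_star = 1.0 / (1.0 + z_star)

    def integrand(a):
        # R = 3 rho_b / (4 rho_gamma) grows linearly with the scale factor.
        r_baryon = 0.75 * omega_b_h2 / omega_gamma_h2 * a
        c_s = C_KM_S / math.sqrt(3.0 * (1.0 + r_baryon))
        # a^2 H(a) in km/s/Mpc; H0 = 100 h km/s/Mpc is absorbed into the omega h^2 terms.
        a2_hubble = 100.0 * math.sqrt(omega_m_h2 * a + omega_r_h2)
        return c_s / a2_hubble

    # Composite Simpson rule on [0, a_star]; the integrand is finite at a = 0.
    step = a_star / n_steps
    total = integrand(0.0) + integrand(a_star)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    return total * step / 3.0


r_s = sound_horizon(omega_m_h2=0.1334, omega_b_h2=0.0226, z_star=1090.0)
print(f"r_s = {r_s:.1f} Mpc")
```

Because only $\Omega_m h^2$ and $\Omega_b h^2$ enter (for a fixed radiation sector), the result is in Mpc with no residual factor of $h$. For these inputs it lands in the neighborhood of the 150 Mpc scale discussed in the text, and raising $\Omega_m h^2$ shortens it.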

The acoustic scale is large, about 150 Mpc comoving, because primordial sound waves travel at 
relativistic speed, maxing out at $c/\sqrt{3}$ at early times when the baryon density is negligible compared
to radiation density. The large size of the acoustic scale protects this clustering feature from non- 
linear structure formation in the low-redshift universe. As discussed below, both cosmological 
perturbation theory and numerical simulations argue that the scale of the acoustic feature is stable 
to better than 1% accuracy, making it an excellent standard ruler. The BAO method measures the 
cosmic distance scale using this ruler. Separations along the line of sight correspond to differences 
in redshift that depend on the Hubble parameter $H(z) r_s$. Separations transverse to the line of sight
correspond to differences in angle that depend on the angular diameter distance $D_A(z)/r_s$.
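
The two relations can be sketched as a round trip between distances and observables. In the snippet below, the 150 Mpc acoustic scale is the value quoted in this section, while the $H(z)$ and $D_A(z)$ inputs are made-up placeholder numbers and the function names are hypothetical; $D_A$ is taken to be the comoving angular diameter distance.

```python
C_KM_S = 299792.458  # speed of light in km/s


def bao_observables(r_s, hubble, d_a):
    """Redshift interval and angle (radians) subtended by a comoving ruler r_s."""
    delta_z = r_s * hubble / C_KM_S   # line of sight: depends on H(z) * r_s
    delta_theta = r_s / d_a           # transverse: depends on D_A(z) / r_s
    return delta_z, delta_theta


def distances_from_bao(r_s, delta_z, delta_theta):
    """Invert the observables, given a ruler length calibrated externally (e.g. by the CMB)."""
    hubble = C_KM_S * delta_z / r_s
    d_a = r_s / delta_theta
    return hubble, d_a


r_s = 150.0                            # Mpc, acoustic scale quoted in the text
hubble_in, d_a_in = 100.0, 2000.0      # ASSUMPTION: placeholder H (km/s/Mpc) and D_A (Mpc)
delta_z, delta_theta = bao_observables(r_s, hubble_in, d_a_in)
hubble_out, d_a_out = distances_from_bao(r_s, delta_z, delta_theta)
print(delta_z, delta_theta, hubble_out, d_a_out)
```

The inversion makes explicit that the survey itself constrains only the combinations $H(z) r_s$ and $D_A(z)/r_s$; absolute values of $H$ and $D_A$ follow once $r_s$ is supplied.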

The challenge of the BAO method is primarily statistical: because this is a weak signal at a large 
scale, one needs to map enormous volumes of the universe to detect the BAO and obtain a precise 
distance measurement. Galaxy redshift surveys allow us to make these large three-dimensional 
maps of the universe, although we will discuss other methods as well. 

At low redshift (z < 0.5), the BAO method strongly complements SN measurements because
BAO provides an absolute distance scale and a strong connection to the CMB acoustic peaks from 
z = 1000, while SN allow more precise measurements of relative distances and thus offer a more 
fine-grained view of the distance-redshift relation. At higher redshift (z > 0.5), the large cosmic
volume and the direct access to $H(z)$ make the BAO method an exceptionally powerful probe of
dark energy and cosmic geometry. 

4.2. The Current State of Play

The acoustic oscillation phenomenon was identified as a potential effect in the CMB sky in
the late 1960s. This was soon extended to the late-time matter power spectrum (Sakharov,
1966; Peebles and Yu, 1970; Sunyaev and Zeldovich, 1970); of course, they were considering pure
baryon cosmologies, where the effect is very strong. The introduction of adiabatic cold dark
matter in the mid-1980s made the predicted late-time acoustic peak very weak (particularly in
the $\Omega_m = 1$ scenario), and the acoustic oscillations were primarily studied in the CMB context
(Bond and Efstathiou, 1984, 1987; Jungman et al., 1996; Hu and Sugiyama, 1996; Hu and White,
1996; Hu et al., 1997). A resurgence of interest in the dynamics of the early universe post-COBE
led to the identification of the acoustic scale as a standard ruler, first in the CMB and then in
the matter power spectrum (Kamionkowski et al., 1994; Jungman et al., 1996; Hu and Sugiyama,
1996; Eisenstein and Hu, 1998; Meiksin et al., 1999). Fisher matrix forecasts for the combination
of CMB and large-scale structure identified the acoustic oscillations as a critical feature in
breaking the distance scale degeneracy between $\Omega_m$ and $H_0$ in CMB model fits (Tegmark, 1997;
Goldberg and Strauss, 1998; Efstathiou and Bond, 1999; Eisenstein et al., 1998). In particular,
the SDSS Luminous Red Galaxy (LRG) sample (Eisenstein et al., 2001) was proposed to maximize
leverage on the large-scale power spectrum, with BAO as one application.

Figure 8 (Left) The current BAO distance-redshift relation. Individual measurements are of the
quantity $D_V(z)/r_s$. We have multiplied by the $r_s$ of the fiducial ΛCDM model to yield a distance;
the sound horizon is predicted to 1.1% from WMAP7. In increasing redshift, data points are from
the 6dFGS (Beutler et al., 2011), SDSS+2dFGRS (Percival et al., 2010), and WiggleZ (Blake et al.,
2011c). The latter two papers also quote correlated results from multiple redshift bins; we have
chosen to plot only a single combined data point for each survey so that the measurement errors
are uncorrelated. As described in the text, for a fixed choice of $w(z)$ and $\Omega_k$, CMB data allows a
prediction for $D_V(z)/r_s$. The flat ΛCDM prediction from the best-fit WMAP7 model is the black
line, and the grey region shows the 1σ WMAP7 range. This is not a fit to the data, but rather the
vanilla ΛCDM prediction from the CMB data. (Right) The same plot after dividing by the ΛCDM
prediction from WMAP7. Also shown are the four alternative models from Table 1; here we have
suppressed the 1σ range that would surround each line owing to uncertainties in the matter and
baryon density. Also shown is the direct $H_0$ value from Riess et al. (2011); here we have assumed
perfect knowledge of the sound horizon, which suppresses a 1.1% uncertainty term between this
value and the BAO points. Both panels are adapted from Mehta et al. (in prep).

After the discovery of cosmic acceleration with Type Ia SNe, the focus on the distance scale as a
function of redshift became intense. In 2003, several papers appeared discussing the acoustic scale
as a standard ruler for the measurement of dark energy in higher redshift galaxy surveys (Eisenstein,
2002; Blake and Glazebrook, 2003; Hu and Haiman, 2003; Linder, 2003; Seo and Eisenstein, 2003).



Footnote (on the SDSS LRG sample): Alex Szalay and James Annis deserve particular credit for leading this development in the early years.

Figure 9 Constraints from combinations of current BAO data, WMAP7 CMB data, and Union2
SN data in (a) the $(\Omega_m, \Omega_\Lambda)$ plane assuming $w = -1$, (b) the $(\Omega_m, w)$ plane assuming $\Omega_k = 0$, and
(c) the $(w_{0.5}, w_a)$ plane assuming $\Omega_k = 0$, where $w_{0.5}$ is the value of $w$ at z = 0.5. Contours show
68% confidence intervals. We omit the CMB+BAO+SN combination from panel (a) because it is
nearly identical to the CMB+BAO combination.



Compelling detections in 2005 intensified these plans (Cole et al., 2005; Eisenstein et al., 2005),
with several observational surveys proposed and numerous theoretical investigations. The rapid
development of the theory led to the DETF featuring BAO as one of the four leading methods for
the study of dark energy (Albrecht et al., 2006).



Early results from the 2dF Galaxy Redshift Survey (2dFGRS) (Percival et al., 2001; Efstathiou et al.,
2002; Percival et al., 2002) and the Abell/ACO cluster sample (Miller et al., 1999) gave hints of
the acoustic feature in the power spectrum. However, the first convincing detections of BAO came
in 2005 from the SDSS Data Release 3 (DR3) and final 2dFGRS samples (Eisenstein et al., 2005;
Cole et al., 2005). Eisenstein et al. (2005) measured the large-scale correlation function of SDSS
LRGs in the redshift range 0.16 < z < 0.47 over 3,816 square degrees, finding the acoustic peak with
3.4σ significance. As they only measured the monopole of the correlation function, Eisenstein et al.
(2005) quoted the distance measurement as a blend of the line-of-sight and transverse distance scale

$$ D_V(z) = \left[ D_A(z)^2 \, \frac{cz}{H(z)} \right]^{1/3} . \qquad (52) $$



Comparing the size of the acoustic scale in SDSS to that in the CMB sky from WMAP, they inferred
the value of $D_V(0.35)$ divided by the distance to z = 1089 with a 1σ uncertainty of 3.7%. (Recall
that we use $D_A$ to denote the comoving angular diameter distance.) Cole et al. (2005) measured
the power spectrum of 2dFGRS galaxies in the redshift range z < 0.3 over 1,800 square degrees.
The cosmological fitting analysis detected a baryon fraction of $\Omega_b/\Omega_m = 0.185 \pm 0.046$, the non-zero
result indicating a detection of the BAO. The distance precision of the result was quoted as a 4.1%
measurement of $H_0$.
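
Equation (52) is straightforward to evaluate for a given cosmology. The sketch below does so for a flat ΛCDM model, for which the comoving angular diameter distance equals the line-of-sight comoving distance. The parameter values $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.27$ are illustrative assumptions, not fits from the papers cited above, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble(z, h0, omega_m):
    """H(z) in km/s/Mpc for flat LCDM, neglecting radiation."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def comoving_distance(z, h0, omega_m, n_steps=1000):
    """Comoving distance in Mpc via Simpson's rule (n_steps even).

    For a flat universe this is also the comoving angular diameter distance D_A.
    """
    step = z / n_steps
    total = 1.0 / hubble(0.0, h0, omega_m) + 1.0 / hubble(z, h0, omega_m)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * step, h0, omega_m)
    return C_KM_S * total * step / 3.0


def d_v(z, h0, omega_m):
    """Spherically averaged BAO distance D_V(z) of equation (52), in Mpc."""
    d_a = comoving_distance(z, h0, omega_m)
    return (d_a ** 2 * C_KM_S * z / hubble(z, h0, omega_m)) ** (1.0 / 3.0)


# ASSUMPTION: illustrative flat LCDM parameters.
dv_035 = d_v(0.35, h0=70.0, omega_m=0.27)
print(f"D_V(0.35) = {dv_035:.0f} Mpc")
```

Two transverse dimensions and one radial dimension enter the monopole, which is why $D_A$ appears squared and $cz/H$ once inside the cube root. At very low redshift both reduce to $cz/H_0$, so $D_V$ does as well.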

Since these first detections, the clustering of successively larger SDSS spectroscopic samples has
been analyzed by several groups using different methods. Tegmark et al. (2006) analyzed the DR4
LRG and main galaxy samples with a quadratic estimator for the power spectrum and redshift
distortion. Hütsi (2006) analyzed the monopole of the power spectrum of the LRG data set with
the Feldman et al. (1994) (FKP) method. Percival et al. (2007) applied the FKP method to the
combined DR5 LRG and main galaxy samples, along with the 2dFGRS sample, to measure the
acoustic scale at two different redshifts (z = 0.20 and 0.35). Percival et al. (2010) extended this
analysis to the final SDSS-II sample (DR7). Kazin et al. (2010b) analyzed the DR7 LRG sample
with the correlation function, achieving consistent results. The result of this ongoing effort is an
aggregate distance precision of 2.7% to z = 0.275 (Percival et al., 2010).



New BAO detections have recently been made in two other samples. Beutler et al. (2011) report
a 2.4σ detection from the 6-degree Field Galaxy Survey (6dFGS), which covered 17,000 deg$^2$ of sky,
obtaining a 4.5% distance measurement to z = 0.1. Stepping beyond z = 0.5, the WiggleZ survey
(Blake et al., 2011a) has used the AAOmega instrument at the Anglo-Australian Telescope to
target emission-line galaxies at 0.4 < z < 1.0. The analysis of the final data set of ~ 800 deg$^2$ yields
BAO detections in three overlapping redshift slices centered on z = 0.44, 0.60, and 0.73, with an
aggregate precision of 3.8%.

Combining SDSS, WiggleZ, and 6dFGS, Blake et al. (2011c) achieve a 5σ detection of the
acoustic peak. They also demonstrate good agreement between the SN distance-redshift relation
and that of the BAO. These three results are displayed in Figure 8, which shows $D_V$ as a function of
redshift (following Mehta et al., in prep). We can compare this $D_V(z)$ to the relation predicted by
WMAP7 under particular assumptions about dark energy and spatial curvature. A given value of
$\Omega_m h^2$ and $\Omega_b h^2$ yields a sound horizon $r_s$. For any fixed choice of $\Omega_k$ and $w(z)$, the angular acoustic
scale in the CMB then breaks the $\Omega_m$-$H_0$ degeneracy, which then specifies $D_V(z)$. The left panel
shows the WMAP7 prediction for flat ΛCDM, with the grey band marginalizing over 1σ errors in
$\Omega_m h^2$ and $\Omega_b h^2$. One can see that the BAO distance measures are in excellent agreement. The
right panel divides by this prediction and then shows how the comparison to the data would vary
with non-zero spatial curvature or $w \neq -1$, using the CMB-normalized models introduced in §2.4.
One can see that small changes, particularly in spatial curvature, make detectable differences in the
prediction, so that comparison of the data to the prediction allows one to measure $w(z)$ and $\Omega_k$.
Comparing variations in spatial curvature to variations of constant $w$, one can see that variations
in spatial curvature produce large offsets but relatively small slopes. SN determinations of relative
distances can only measure slopes on this graph, whereas absolute distance measurements such
as BAO can measure the offset. This illustrates why, in fits to the $w$-$\Omega_k$ model, the CMB+SN
combination tends to measure $w$ better while the CMB+BAO combination tends to measure $\Omega_k$
better.



Combining these BAO measurements with WMAP7 and the Union-2 SN sample, Blake et al.
(2011c) infer $\Omega_m = 0.289 \pm 0.015$, $H_0 = 68.7 \pm 1.9$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_k = -0.004 \pm 0.006$, and a
constant $w = -1.03 \pm 0.08$. One can think of this inference approximately as CMB acoustic peak
heights measuring $\Omega_m h^2$, the BAO standard ruler then splitting $\Omega_m$ and $H_0$, the CMB angular
acoustic scale measuring $\Omega_k$, and the SNe measuring $w$. Figure 9 displays our own constraints
derived from these data with CosmoMC (Lewis and Bridle, 2002), with the same parameter space
used for the SN and CMB constraints in Figure 6. While Figure 6 includes contours for SN
alone, it makes little sense to consider BAO constraints independent of CMB data because the
latter are needed to calibrate the BAO ruler. We therefore show contours for CMB, CMB+BAO,
and CMB+BAO+SN. Consistent with our earlier discussion, CMB+BAO provides much tighter
constraints on $\Omega_m$ and $\Omega_\Lambda$ in the $w = -1$ model than CMB+SN (compare the left panels of Figs. 6
and 9), but CMB+SN provides better constraints on $w$ (middle panels). For a $w_0$-$w_a$ model with
$\Omega_k = 0$ (right panel), the three data sets together yield a good measurement of $w(z = 0.5)$ but still
only loose constraints on $w_a$.



The BAO signal has also been detected using photometric redshifts from the SDSS (Padmanabhan et al.,
2007; Blake et al., 2007; Crocce et al., 2011; Sawangwit et al., 2011a). These analyses produced a
6.5% measurement of the angular diameter distance to z = 0.5. An analysis of the maxBCG cluster
catalog by Hütsi (2010) also yields a 2-2.5σ detection.

Other analyses have focused on the anisotropic BAO signal, with the intent of separating $D_A(z)$
and $H(z)$. Okumura et al. (2008) performed a correlation function analysis of the LRG sample from
SDSS DR3, achieving a weak indication of the radial BAO. Gaztañaga et al. analyzed the
correlations of the SDSS LRG sample, considering only pairs very close to the line of sight. They
claimed a detection of the BAO, thereby measuring $H(z)$; however, the proposed acoustic peak
is much higher amplitude than the predicted one and is likely only noise (Kazin et al., 2010a).
Chuang et al. (2010) analyzed the full SDSS DR7 LRG sample with an anisotropic correlation
function, finding separate constraints on $D_A$ and $H$ at z = 0.35.

In summary, the BAO has been found in six different samples (2dFGRS, SDSS LRG, SDSS
Main, 6dFGS, WiggleZ, and SDSS photometric), with analyses from several independent research
teams and with a variety of methods. The best precision is now slightly below 3%, with excellent
agreement with the ΛCDM model.

The next generation of the SDSS large-scale structure survey is BOSS, the Baryon Oscillation
Spectroscopic Survey of SDSS-III. BOSS is observing 1.5 million luminous galaxies (mostly LRGs)
out to z = 0.7 over 10,000 square degrees (Eisenstein et al., 2011), with a selection that triples the
number density of LRGs at z < 0.4 relative to SDSS-II and extends to a new redshift range with
a dense sample at 0.5 < z < 0.7. The increased sampling should facilitate accurate density-field
reconstruction (§4.3.3) to boost the BAO performance. BOSS is also surveying the 2 < z < 3
universe using a grid of quasar sightlines to provide a 3-dimensional view of the Lyα forest, with
the goal of detecting BAO in the large-scale clustering of neutral hydrogen at z ~ 2.5. This method
was proposed by White (2003) and McDonald and Eisenstein (2007). The clustering of Lyα forest
flux along single lines of sight is well established as a probe of large-scale structure (see §7.6 for a
discussion of the underlying theory), and by using cross-correlations among multiple lines of sight
one can probe 3-dimensional structure. Observationally this method is attractive because quasars
are very luminous and because each quasar provides ~ 50 measurements of the large-scale density
field along its line of sight. Since the BAO peak has an intrinsic rms width of ~ 8 Mpc, one need
only survey the Lyα forest at modest resolution (a few hundred) to retain full BAO information.
Furthermore, one does not need high signal-to-noise ratio spectra, as one gets little gain from
photon errors smaller than the intrinsic variation in the small-scale forest. BOSS has achieved the
first detection of 3-d structure on 10-50 Mpc scales in the Lyα forest (Slosar et al., 2011),
and it aims to use 150,000 quasars over 10,000 square degrees to achieve the first measurements of
BAO at z > 1.



4.3. Theory of BAO 

While the theory of supernova explosions is complicated, the use of Type la SNe as distance 
indicators rests on empirically determined correlations that are largely independent of that theory. 
With BAO, on the other hand, we are using a standard ruler whose length, imprint on the clustering 
of observable tracers, and even very existence are derived from theory. We therefore review both 
the long-established linear theory of BAO and more recent work on non-linear evolution and galaxy 
bias, and we discuss the implications of this work for analysis of BAO data sets. 

4.3.1. Linear Theory 

Prior to redshifts around 1000, the universe is hot and dense enough that the primordial gas 
is ionized. The free electrons in this plasma provide enough cross-section to the cosmic microwave 
background photons via Thomson scattering to produce a mean free time well less than the Hubble 
time. The result is a close coupling between the electrons, nuclei (baryons), and photons for suffi- 
ciently long wavelength perturbations in the early universe. The radiation pressure of the photons 
is large compared to the gravitational forces in the perturbations, with the result that perturbations
in the baryon-photon fluid oscillate as sound waves (Peebles and Yu, 1970; Sunyaev and Zeldovich,
1970). Diffusion of photons relative to baryons damps these oscillations on comoving scales smaller
than ~ 8 $h^{-1}$ Mpc, the phenomenon known as Silk damping (Silk, 1968).

Figure 10 The generation of the acoustic peak illustrated via the linear-theory response to an
initially point-like overdensity at the origin; this figure is reproduced from Eisenstein et al.
Each panel shows the radial perturbed mass profile in each of the four species: dark matter (black),
baryons (blue), photons (red), and neutrinos (green). The redshift and time after the Big Bang are
given in each panel. All perturbations are fractional for that species. We have multiplied the radial
density profile of the perturbation by the square of the radius in order to yield the mass profile. In
detail, we begin with a compact but smooth profile at the origin, which is why the mass profiles go
to zero there. As we are using linear theory, the normalization of the amplitude of the perturbation
(and thus the absolute scale of the y-axis) is arbitrary. a) Near the initial time, the photons and
baryons are tightly coupled in a spherical traveling wave. b) The outward-going wave of baryons
and relativistic species increases the perturbation of the cold dark matter, similar to raising a wake.
c) At recombination, the photons decouple from the baryons. d) With recombination complete,
the CDM perturbation is near the origin, while the baryonic perturbation is in a shell of 150 Mpc.
e) With pressure forces now small, baryons and dark matter are attracted to these overdensities
by gravitational instability. f) Because most of the growth is drawn from the homogeneous bulk,
the baryon fraction converges toward the cosmic mean at late times. Galaxy formation is favored
near the origin and at a radius of 150 Mpc. These figures were made by suitable transforms of
the transfer functions created by CMBfast (Seljak and Zaldarriaga, 1996; Zaldarriaga and Seljak).



After recombination, the mean free time of the photons in the neutral cosmic gas is long com- 
pared to the Hubble time. The photons decouple from the perturbations in the baryons and soon 
become smoothly distributed. The perturbations in the baryons are now subject to gravitational 
instability, just like the dark matter perturbations. 

As with normal sound waves, one can usefully view the BAO phenomenon from different linear
basis sets. We first consider the response to a density perturbation at a particular initial location,
as illustrated in Figure 10; see Eisenstein et al. (2007b) and Eisenstein and Bennett (2008)
for further description of this view. Primordial perturbations of the adiabatic form predicted by
standard inflation models consist of equal fractional density contrasts in all species. The dark mat- 
ter perturbation grows in place, slowly at first in the radiation dominated epoch, then faster as the 
universe becomes matter dominated. The baryon-photon perturbation, on the other hand, travels 
away from its origin as a sound wave. At recombination, the baryon part of the wave is left in a 
spherical shell centered on the original perturbation. Both the dark matter at the center and the 
baryons on the shell seed gravitational instability, which grows to form the halos in which galaxies 
form. We therefore expect the distribution of separations of pairs of galaxies (i.e., the two-point 
correlation function generated by such perturbations) to show a small enhancement at the radius 
of the shell, with galaxy concentrations in the central dark matter clumps and in the shells induced 
by the baryons. 



One can equally well view the BAO effect as a standing wave in Fourier space; see Hu and Sugiyama (1996)
and Eisenstein and Hu (1998) for this explanation. In Fourier space, the single acoustic scale
gives rise to a harmonic sequence of oscillations in the power spectrum. This is easy to understand 
physically. The power spectrum encodes the response of the universe to a plane wave perturbation. 
Each crest in the initial wave produces a planar sound wave that travels a distance equal to the 
acoustic scale. If the wave deposits the baryon perturbation on another crest of the dark matter 
perturbation, then one gets constructive interference; if the sound wave ends in a dark matter 
trough, one gets destructive interference. The result is a harmonic relation between the wavelength 
of the perturbation and the acoustic scale. 

Mathematically, this correspondence can be seen by considering that the correlation function 
and power spectrum are Fourier transform pairs. The Fourier transform of a delta function is a 
sinusoid, and the smearing of a delta function simply provides a damping envelope to that sinusoid. 
In the case of the BAO, this smearing is largely due to Silk damping in the early universe and 
to non-linear structure formation at late times. Both cause the higher harmonics in the power 
spectrum to be reduced in amplitude or washed out. 
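
The statement about smearing can be checked numerically. The sketch below takes an idealized acoustic feature, a thin spherical shell of radius `r_s` convolved with an isotropic Gaussian of rms width `sigma`, and compares its numerically integrated Fourier transform with a sinusoid in $k r_s$ multiplied by a Gaussian damping envelope. The shell model, the values `r_s = 150` and `sigma = 8` (same length units), and the function names are illustrative assumptions; this is not the full BAO transfer function.

```python
import math


def shell_transform_numeric(k, r_s, sigma, dr=0.01):
    """Fourier transform of a unit-mass shell of radius r_s smeared by a 3D Gaussian.

    The smeared shell has radial density
        rho(r) = [g(r - r_s) - g(r + r_s)] / (4 pi r r_s),
    with g a normalized 1D Gaussian, so the transform
        int 4 pi r^2 rho(r) sin(kr)/(kr) dr
    reduces to the one-dimensional integral evaluated below.
    """
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)

    def gauss(x):
        return norm * math.exp(-0.5 * (x / sigma) ** 2)

    r_max = r_s + 10.0 * sigma          # the integrand is negligible beyond this radius
    n = int(r_max / dr)
    total = 0.0
    for i in range(1, n):               # trapezoid rule; the integrand vanishes at both ends
        r = i * dr
        total += math.sin(k * r) * (gauss(r - r_s) - gauss(r + r_s))
    return total * dr / (k * r_s)


def shell_transform_analytic(k, r_s, sigma):
    """Sinusoid from the sharp shell times the Gaussian damping envelope."""
    return math.sin(k * r_s) / (k * r_s) * math.exp(-0.5 * (k * sigma) ** 2)


# ASSUMPTION: illustrative ruler length and smearing width.
for k in (0.02, 0.05, 0.1, 0.2):
    numeric = shell_transform_numeric(k, r_s=150.0, sigma=8.0)
    analytic = shell_transform_analytic(k, r_s=150.0, sigma=8.0)
    print(k, numeric, analytic)
```

The oscillation period in $k$ is set by `r_s` alone, while `sigma` only controls how quickly the higher harmonics are suppressed, which is the sense in which smearing damps the wiggles without moving them.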

While it is secondary to our pedagogical thread, we end with some additional discussion of Figure
10 and the evolution of the initial point-like density perturbation. First, because the perturbation
is in the growing mode, only the density perturbation is localized. The velocity perturbation
away from the initial density perturbation has zero divergence but is non-zero; hence it scales as
$r^{-2}$ at large radius. As the baryon-photon and neutrino pulses expand, the gravity interior to
the shell is weaker than it would have been. This causes the velocity perturbation interior to 
grow less quickly, creating a non-zero divergence away from the origin, which is why the CDM 
perturbation grows at non-zero radius. The size of this effect depends on the radiation to matter 
density; this transformation of the CDM perturbation is the famous $k^{-2}$ tail of the CDM transfer
function (Peebles, 1981). The non-zero velocity perturbation is also the reason why the neutrino
perturbation does not remain as a sharp peak. Finally, we note that this description of the behavior 
is the Green's function of the system. CMB Boltzmann codes typically compute the evolution of
individual Fourier standing waves; these are simply combined to generate the response to a point
perturbation rather than a single standing wave.

Figure 11 The effects of non-linear clustering on the BAO. (Left) Redshift-space matter correlation
function at four different redshifts from the simulations of Seo and Eisenstein (2005). (Right) Real-space
matter power spectra at four different redshifts from the simulations of Seo et al. (2008),
divided by a smooth power spectrum so as to reveal the acoustic oscillations. The input linear
theory is shown by the dashed line. The effects of non-linear structure formation broaden the
acoustic peak in the correlation function. In the power spectrum, this corresponds to a damping
of the higher harmonics. Importantly, the boost of broad-band power at late times visible in the
power spectrum plot corresponds largely to correlations at scales much smaller than the acoustic
peak.



4.3.2. Non-linear Evolution and Galaxy Clustering Bias

The clustering of matter and galaxies undergoes substantial changes at low redshift beyond
the growth described by linear perturbation theory. Small-scale structure grows non-linearly,
peculiar velocities behave differently from their linear prediction, and galaxies trace the dark
matter in a complicated manner. We should worry that these effects might modify the location
of the BAO feature relative to the prediction of linear theory, thus distorting our standard
ruler (Meiksin et al., 1999; Seo and Eisenstein, 2005; Angulo et al., 2005; Springel et al., 2005;
Jeong and Komatsu, 2006; Huff et al., 2007; Angulo et al., 2008; Wagner et al., 2008).

Figure 12 The shifts of the acoustic scale in cosmological N-body simulations. (Left) Shifts of the
acoustic scale in the redshift-space matter power spectrum versus redshift from Seo et al. (2010b).
The open symbols show the acoustic scale shifts prior to reconstruction; the dashed lines show
a scaling of the square of the linear growth function. The solid symbols show the shifts after
reconstruction is applied. The error bars are derived from the variance among simulations. (Right)
Shifts of the acoustic scale in the redshift-space power spectrum of mock galaxy distributions at
$z = 1$ from Mehta et al. (2011). The acoustic scale shift from the matter distribution in the
same boxes has been subtracted so as to decrease sample variance. Galaxies are placed via HOD
prescriptions; increasing mass thresholds leads to lower number densities and higher clustering bias.
The open symbols show the shifts prior to reconstruction; the solid symbols, after reconstruction.
The errors in the right panel are larger due to the smaller simulated volume and the lower number
density of tracers. In all cases of both panels, reconstruction decreases the errors on the acoustic
scale and reduces the shift to be consistent with zero. The left panel is based on 63 simulations,
each using $576^3$ particles in a $2\,h^{-1}$ Gpc cube. The right panel is based on 40 simulations, each
with $1024^3$ particles in a $1\,h^{-1}$ Gpc cube.

Fortunately, the large scale of the acoustic peak insulates it from most of non-linear structure
formation (Eisenstein et al., 2007b). A typical pair of dark matter particles changes its comoving
separation by 10 Mpc (rms value) between high redshift and $z = 0$. These motions broaden the
acoustic peak, but the rms displacement is only mildly larger than the $8\,h^{-1}$ Mpc scale set by Silk
damping. The apparent displacement along the line of sight is larger in redshift space, because the
peculiar velocity is well correlated with the displacement. Figure 11 shows the correlation function
and power spectrum from N-body simulations; one can see that the acoustic peak in the correlation
function becomes broader at low redshift. The corresponding effect in the power spectrum is the
decreased amplitude of the wiggles at higher wavenumber. Roughly speaking, one can think of the
width of the evolved $\xi(r)$ peak as the quadrature sum of the initial width and the rms pairwise
displacement $\Sigma_{\rm nl}$ (see Orban and Weinberg 2011, who examine idealized BAO models numerically
and analytically). Equivalently, the oscillations in $P(k)$ are damped by a factor $\exp(-k^2\Sigma_{\rm nl}^2)$. As
discussed in §4.3.4, the broadening of the BAO feature does not significantly bias the acoustic scale
measurement provided one is using a suitable template-fitting method. However, it does degrade
the precision of the measurement for a given survey volume, as it is harder to centroid a broader
feature.
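As a rough numerical illustration of the two statements above, the sketch below evaluates the quadrature sum for the evolved peak width and the Gaussian damping factor for the wiggles. The function names and the sample numbers (widths of 8 and 10 in a common length unit) are assumptions chosen for illustration, not values taken from any of the cited analyses.

```python
import numpy as np


def evolved_peak_width(initial_width, sigma_nl):
    """Quadrature sum of the initial peak width and the rms pairwise displacement."""
    return np.hypot(initial_width, sigma_nl)


def wiggle_damping(k, sigma_nl):
    """Damping factor exp(-k^2 Sigma_nl^2) applied to the oscillations in P(k)."""
    k = np.asarray(k, dtype=float)
    return np.exp(-(k * sigma_nl) ** 2)


# Illustrative inputs only: both lengths share one unit, and k uses its inverse.
initial_width = 8.0   # assumed initial width of the acoustic peak
sigma_nl = 10.0       # assumed rms pairwise displacement
k = np.array([0.05, 0.1, 0.2, 0.3])

print("evolved width:", evolved_peak_width(initial_width, sigma_nl))
print("damping at k: ", wiggle_damping(k, sigma_nl))
```

The damping falls steeply with wavenumber, which is why the higher harmonics are the first to disappear while the first wiggles survive.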

To change the acoustic scale itself, one needs instead to move pairs systematically closer or
systematically further away. This is a much weaker effect than the rms motion of particles, as it
depends on the density variations in 150 Mpc spheres, which are percent level. Moreover, pairs of
overdensities fall toward each other and pairs of underdensities fall away from each other, and both
situations count equally toward a two-point statistic, causing a partial cancellation.

Padmanabhan and White (2009) compute the change in the acoustic peak location at second
order in gravitational perturbation theory. Crocce and Scoccimarro (2008) have done similar
calculations in renormalized perturbation theory. Both calculations reveal a second-order term of
the form $d\xi/dr$, which corresponds to moving the acoustic peak. Padmanabhan and White (2009)
compute the size of this effect to be around 0.25% at $z = 0$.
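Why a term proportional to $d\xi/dr$ moves the peak can be seen from a first-order Taylor expansion. In the sketch below, $\epsilon$ is a small length introduced here purely for illustration; it is not a symbol used in the cited papers.

```latex
\xi(r) + \epsilon\,\frac{d\xi}{dr} \simeq \xi(r + \epsilon)
\quad\Longrightarrow\quad
r_{\rm peak} \;\to\; r_{\rm peak} - \epsilon ,
\qquad
\frac{\Delta r_{\rm peak}}{r_{\rm peak}} = -\frac{\epsilon}{r_{\rm peak}} .
```

For a fractional shift of about 0.25% and a peak near 150 Mpc, $|\epsilon|$ would be roughly 0.4 Mpc, far smaller than the width of the peak itself.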

N-body simulations reveal a similar story. Seo et al. (2010b) measure the shift in the acoustic
scale in a large volume of simulations and detect a shift from $\alpha = 1$ of 0.3% ± 0.015% at $z = 0.0$,
with a scaling in redshift proportional to the square of the linear growth function as expected for a
second order effect (left panel of Figure 12). Padmanabhan and White (2009) validate their analytic
calculation with a similar set of simulations.

Redshift-space distortions have further effects on the BAO signal beyond the extra broadening
from the large-scale peculiar velocity. Small-scale velocities, e.g., the Finger of God effect, blur
the measurement of clustering along the line of sight, thereby broadening the acoustic peak.
Moreover, the peculiar velocities create anisotropy in the broadband clustering, which must be carefully
accounted for when extracting the acoustic scale (§4.3.4).

Linear bias, with galaxy density contrast $\delta_g = b\,\delta_m$, changes the amplitude of $\xi(r)$ or $P(k)$ but
not the shape. However, any realistic bias relation must be at least somewhat non-linear, which
alters the relative weighting of overdense and underdense regions and should shift the acoustic scale
at second order. Early work attempted to measure this shift in simulations (Seo and Eisenstein,
2003; Angulo et al., 2008), but the volume of the simulations was insufficient to get a conclusive
detection of the effect. More recently, Padmanabhan and White (2009) explored galaxy bias as
the ratio of the second-order to first-order bias term, finding shifts of a few tenths of a percent for
reasonable bias cases. Mehta et al. (2011) treated the problem numerically with halo occupation
distributions, finding shifts of 0.1% to 0.8% at $z = 1$ depending on the strength of the bias (right
panel of Figure 12). For halo-based models or other prescriptions that tie galaxy bias to the local
density field, it therefore appears that bias-induced shifts are small, and corrections of modest
fractional accuracy (e.g., to 20% of the shift itself) will suffice to make them negligible. The
relevant bias parameters should be tightly constrained by smaller scale clustering measurements
and higher order statistics, enabling cross-checks of the model used for correction.

Non-local bias models that tie galaxy formation efficiency directly to the environment on much
larger scales (e.g., Babul and White 1991; Bower et al. 1993) could perhaps induce larger shifts of
the acoustic scale. However, such models require fairly extreme physical effects, and they can be
readily diagnosed via their impact on clustering at scales below the BAO scale (Narayanan et al.).
A survey capable of measuring the acoustic scale to the sub-percent statistical level will
provide in its millions of galaxies extensive opportunities to constrain even very general bias models
accurately enough to predict the acoustic scale shift to within 10-20% of its value, sufficient to bring
the systematic error below the statistical error.

4.3.3. Reconstruction

By broadening and shifting the BAO feature in $\xi(r)$, non-linear gravitational evolution
degrades BAO precision and introduces a possible systematic. Is it possible to remove these effects
by "running gravity backwards" to reconstruct the linear density field? The Zel'dovich (1970)
approximation, in which particles follow straight line trajectories in comoving coordinates at the
rate predicted by linear perturbation theory, captures important aspects of non-linear evolution
on large scales (e.g., Weinberg and Gunn 1990; Melott et al. 1994). Eisenstein et al. (2007a) show
that a simple reconstruction scheme based on applying the (reversed) Zel'dovich approximation
to the smoothed non-linear density field is remarkably successful at recovering BAO information,
effectively shifting the low redshift curves in Figure 11 back towards the high redshift curves.

Seo and Eisenstein (2007) and Seo et al. (2010b) investigate this reconstruction method in more
detail, showing that it noticeably improves the scatter and decreases the shift of the recovered
acoustic scale from the matter density field of N-body simulations. The latest simulations demonstrate
that the non-linear shift of the scale has been removed to 0.02% or better (see Figure 12, left).
Moreover, comparing the initial conditions to final conditions on a mode-by-mode basis shows that
the linear density field has been recovered to roughly double the pre-reconstruction wavenumber.
Padmanabhan et al. (2009) analyze the method analytically, revealing the improvement while also
noting that the recovered density field is not exactly the linear one.
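To make the idea concrete, here is a minimal one-dimensional sketch of the displacement estimate behind Zel'dovich-style reconstruction: smooth the density contrast, then solve $\nabla\cdot\Psi = -\delta$ in Fourier space. This is not the algorithm of any of the cited papers. The function name, the Gaussian smoothing kernel, the periodic 1D grid, and the single-mode test field are all assumptions made for illustration.

```python
import numpy as np


def zeldovich_displacement_1d(delta, box_size, smoothing_scale):
    """Estimate the displacement Psi from a periodic 1D density contrast.

    Solves d(Psi)/dx = -delta_s, where delta_s is delta smoothed by a Gaussian.
    """
    n = delta.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    delta_k = np.fft.fft(delta) * np.exp(-0.5 * (k * smoothing_scale) ** 2)
    psi_k = np.zeros_like(delta_k)
    nonzero = k != 0.0
    psi_k[nonzero] = 1j * delta_k[nonzero] / k[nonzero]  # i k delta / k^2 in 1D
    return np.fft.ifft(psi_k).real


# Toy field: a single long-wavelength mode on a periodic grid (assumed values).
box_size, n, amplitude, smoothing = 1000.0, 256, 0.1, 10.0
x = np.arange(n) * box_size / n
k0 = 2.0 * np.pi / box_size
delta = amplitude * np.cos(k0 * x)

psi = zeldovich_displacement_1d(delta, box_size, smoothing)
expected = -(amplitude / k0) * np.exp(-0.5 * (k0 * smoothing) ** 2) * np.sin(k0 * x)
print("max deviation from analytic Psi:", np.max(np.abs(psi - expected)))

# "Moving the particles back": shift tracers by minus the estimated displacement.
x_reconstructed = (x - psi) % box_size
```

A survey analysis would work in three dimensions, interpolate the displacement to the galaxy positions, and account for bias and redshift-space effects, all of which are ignored in this sketch.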

Mehta et al. (2011) extend this analysis to HOD-based mock galaxy catalogs in simulations.
They consider a range of HOD prescriptions and find that the reconstruction of the linear density
field is not degraded by this form of galaxy bias and that the shift of the acoustic scale after
reconstruction still vanishes, this time to 0.1% precision (Figure 12, right). This success is not surprising:
the halo field traces the matter field fairly accurately on the scales required for reconstruction, so
one is correctly estimating and removing the large scale displacements. Non-linear galaxy bias still
alters the weighting of convergent and divergent flows, but if the flows are being mostly removed,
then it doesn't matter how they are weighted.

Figure 13 A pedagogical illustration of how reconstruction can improve the measurement of the
acoustic scale; this figure is from Padmanabhan et al. (in prep). Each panel shows a thin slice of
a cosmological density field. (Top Left) At early times, the density is nearly constant. We mark a
set of points at the origin in blue and a ring of points at 150 Mpc in heavy black. We measure the
distances between the black points and the centroid of the blue points; the rms of these distances
is represented by the Gaussian in the inset. (Top Right) At later times, structure has formed (in
this calculation, simply by the Zel'dovich approximation), and the points have moved. The red
circle shows the initial radius of the ring, centered on the current centroid of the blue points. The
fact that the black points no longer fall on the red ring indicates that the acoustic peak has been
broadened. The inset shows that the new rms of the radial distance (solid line) is larger than the
original (dashed line). (Bottom Left) Arrows show the Zel'dovich displacements responsible for the
structure that has formed. The idea of reconstruction is to estimate these displacements and move
the particles back. (Bottom Right) We illustrate this by smoothing the density field by a $10\,h^{-1}$
Mpc filter and moving the particles back. Because the displacement field is imperfectly estimated,
small-scale structure remains. But the black points now fall closer to the red ring, so that the
rms of the radial distance is close to the initial (inset). The actual reconstruction algorithm of
Padmanabhan et al. (in prep) is more complex, but this example shows the basic opportunity.

Reconstruction is thus a powerful tool: one is achieving better statistical precision for a given
survey, typically by a factor of 1.5 to 2, equivalent to a factor of 2-4 increase in survey size.
Meanwhile, one is mitigating the primary systematic error from non-linear clustering and galaxy
bias. As an added benefit, one can use the estimate of the large scale displacements to remove
large-scale redshift-space distortions, decreasing that degradation of the BAO accuracy and also
pushing most of the BAO signal into the monopole and quadrupole components of $\xi(s)$ or $P(k)$.
Without reconstruction, the redshift distortions contain significant terms in the hexadecapole and
beyond, and the quadrupolar squashing of the Alcock-Paczynski effect couples to the quadrupole
redshift distortion to produce BAO signal in the hexadecapole. To the extent that one is recovering
the linear density field, one can also hope that the large-scale density field is more Gaussian, which
is a major simplification for computing likelihood functions. However, this last property has not
been extensively tested.
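The equivalence between a precision gain and a larger survey follows from the volume scaling of the statistical error. Assuming the error on the acoustic scale falls as the inverse square root of the survey volume at fixed sampling density, a precision gain factor $g$ maps to an equivalent volume as sketched below. The symbols $\sigma_\alpha$, $V_{\rm equiv}$, and $g$ are notation introduced here for illustration.

```latex
\sigma_\alpha \propto V^{-1/2}
\quad\Longrightarrow\quad
\frac{V_{\rm equiv}}{V} = g^2 ,
\qquad g = 1.5 \;\to\; 2.25 , \qquad g = 2 \;\to\; 4 .
```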

There is an extensive literature on reconstruction methods for large-scale structure (e.g.,
Peebles 1989; Weinberg and Cole 1992; Nusser and Dekel 1992; Croft and Gaztanaga 1997;
Narayanan and Weinberg 1998; Mohayaee et al. 2006). Even simple methods appear adequate for
BAO recovery, but better reconstruction is valuable for other applications of large-scale structure
(Reid et al.). Since BAO surveys are typically sparse, an important area for continuing research is
the performance of methods in the presence of both galaxy bias and significant shot noise. The
effectiveness of reconstruction as a function of sampling density might have important implications
for survey design, favoring different choices compared to the statistical considerations discussed in
§4.4 below.



4.3.4. Fitting to Data

It is worth stressing that "the acoustic scale" is only an approximate description of a more 
complicated physical situation. For high precision work, we cannot separate the concept of the 
acoustic scale from the context of a Boltzmann code prediction for the matter power spectrum and 
CMB anisotropy power spectrum. The sound horizon defined by equation (51) does not correspond
to the exact maximum of the acoustic peak in the correlation function, nor do the harmonics in 
the matter power spectrum have an identical scale to those in the CMB anisotropy spectrum. The 
differences arise from effects such as the fact that photons decouple from the baryons earlier than 
the baryons decouple from the photons, that the post-recombination matter growing mode is largely 
set by the velocity perturbation at recombination rather than the density perturbation, and that 
Silk damping alters the effective redshift of recombination as a function of wavenumber. These 
effects are accurately calculated in the Boltzmann codes, resulting in precise predictions for the 
matter and CMB power spectra. 

When one wants to extract the acoustic scale (i.e., to measure distance using the BAO standard
ruler) from a measurement of the two-point clustering, the appropriate thing to do is to use the
predicted clustering for the cosmology one is testing as a template. The optimal plan is then to fit
that template to the data over a range of scales using the correct covariance matrix or likelihood.
Some early works instead used non-parametric models for the acoustic peak, such as a Gaussian in
configuration space or a damped sinusoid in Fourier space (Blake and Glazebrook, 2003), or simply
identified the maximum of the correlation function (Guzik et al., 2007; Smith et al., 2008). We
believe that, because the acoustic scale is predicted only in the context of an early universe model
with parameters taken from fits to CMB data, there is no extra value in avoiding the linear-theory
model predictions.

Having said that, one does want to modify the template to allow for effects of non-linear
structure formation and perhaps to marginalize over broad-band terms that might enter from
scale-dependent clustering bias or velocity bias or from errors in the calibration of one's survey.
This procedure has been carried out by several different authors. For example, Seo and Eisenstein
and Seo et al. (2010b) fit the measured power spectrum to the form

$$P_{\rm obs}(k) = B(k)\,P_m(k/\alpha) + A(k) , \tag{53}$$

where $A(k)$ and $B(k)$ are smooth functions with parameters to be fit. $P_m(k)$ is the linear theory
model with the acoustic oscillations additionally damped by large-scale structure,

$$P_m(k) = \exp(-k^2\Sigma_{\rm nl}^2)\,[P_{\rm lin}(k) - P_{\rm nw}(k)] + P_{\rm nw}(k) , \tag{54}$$

where $\Sigma_{\rm nl}$ is a constant fit from simulations. $P_{\rm nw}$ is the no-wiggle power spectrum from Eisenstein and Hu
(1998), hand-crafted to edit out the acoustic oscillations. $P_{\rm lin}$ is the exact linear theory power
spectrum; note that $P_m(k)$ goes to this exact linear theory result in the limit of negligible damping
($\Sigma_{\rm nl} \to 0$), so the approximate form of $P_{\rm nw}(k)$ is acceptable. The broadband terms $A(k)$ and
$B(k)$ will correct for non-linear power, the shot noise, scale-dependent bias, and any imperfections
in the survey. The primary goal for the fit is to measure $\alpha$, a factor that dilates the scale of the
predicted clustering (and of the BAO feature in particular) relative to the observed clustering. A
value $\alpha = 1$ indicates agreement with the acoustic scale of the original model. A value $\alpha \neq 1$
indicates that the acoustic scale of the linear-theory model is incorrect or that the distance scale
assumed in measuring the galaxy clustering was wrong. Simple alterations of this prescription can
be made for fitting the correlation function or mixed-space $\omega$ or wavelet statistics (Xu et al., 2010;
Arnalte-Mur et al., 2011). This approach thus allows one to fit for the scale of a standard ruler
without having to recompute a full predicted power spectrum at every point in parameter space.

Footnote: The line-of-sight peculiar velocity is, in the Zel'dovich approximation, equal to $f(z)H(z)$
times the line-of-sight component of the 3-d displacement.
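The template construction of equations (53) and (54) can be sketched in a few lines. Everything specific here is an assumption made for illustration: the toy spectra `p_nw` and `p_lin` are simple analytic stand-ins (not the Eisenstein and Hu form or a Boltzmann-code output), `R_S` is an arbitrary toy acoustic scale, and the fit is a brute-force scan over $\alpha$ on noiseless mock data with $B = 1$ and $A = 0$ rather than a proper likelihood analysis.

```python
import numpy as np

R_S = 105.0  # toy acoustic scale (assumed, arbitrary length units)


def p_nw(k):
    """Toy smooth ('no-wiggle') spectrum; NOT the Eisenstein & Hu (1998) form."""
    return k / (1.0 + (k / 0.02) ** 2) ** 1.5


def p_lin(k):
    """Toy 'linear' spectrum: smooth shape times small decaying oscillations."""
    return p_nw(k) * (1.0 + 0.05 * np.sin(k * R_S) * np.exp(-(k / 0.2) ** 2))


def p_m(k, sigma_nl):
    """Equation (54): damp only the oscillatory part P_lin - P_nw."""
    return np.exp(-(k * sigma_nl) ** 2) * (p_lin(k) - p_nw(k)) + p_nw(k)


def p_model(k, alpha, sigma_nl, b_coeffs=(1.0,), a_coeffs=(0.0,)):
    """Equation (53) with polynomial B(k) and A(k) (highest power first)."""
    return np.polyval(b_coeffs, k) * p_m(k / alpha, sigma_nl) + np.polyval(a_coeffs, k)


def best_alpha(k, p_data, sigma_nl, alphas):
    """Brute-force scan of alpha with B = 1, A = 0 and unit weights."""
    chi2 = [np.sum((p_data - p_model(k, a, sigma_nl)) ** 2) for a in alphas]
    return alphas[int(np.argmin(chi2))]


k = np.linspace(0.02, 0.3, 200)
mock = p_model(k, alpha=1.02, sigma_nl=6.0)   # noiseless mock "data"
alphas = np.linspace(0.9, 1.1, 201)
print("recovered alpha:", best_alpha(k, mock, 6.0, alphas))
```

In a realistic fit the coefficients of $A(k)$ and $B(k)$ would be marginalized at each trial $\alpha$ and the residuals weighted by the covariance matrix; with $B = 1$ and $A = 0$ the scan above also uses the dilation of the smooth shape, which a real analysis deliberately fits away.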

This fitting procedure is only compelling to the extent that the recovered value of $\alpha$ is stable
(to within the statistical errors) as one varies the prescription for the marginalization of parameters
in $A(k)$ and $B(k)$. Too little freedom and one may be biased by broadband tilts and modulations
that one hasn't modelled properly; too much freedom and one will fit out the acoustic signature
and reduce the constraining power of the data. Fortunately, the separation between the acoustic
scale and the typical non-linear scale and Silk damping scale is large, i.e., the acoustic peak in
the correlation function is narrow. This gives considerable freedom to fit away broadband nuisance
terms while not impacting the acoustic peak. Seo et al. (2010b) show stable results for $\alpha$ for various
choices, e.g., polynomials of different order. Similarly, $\alpha$ is robust to changes in the choice of $\Sigma_{\rm nl}$,
so one is not sensitive to how one estimates that parameter in simulations or mock catalogs.

An equivalent method has been used by Percival et al. (2007, 2010), and in related works. Here,
one fits a spline to the measured power spectrum and divides by that spline. One does similarly
for the template $P_m(k)$ and fits that to the residual spectrum of the data. This is equivalent to
taking $B(k)$ to be a spline and setting $A(k) = 0$. Clearly the performance depends on the number
of spline points, but there is a broad stable region.

The definition of the acoustic scale as the distance a primordial sound wave could travel before
recombination (eq. 51) is borne out in such fits. If one fits with the power spectrum from a cosmology
that is moderately wrong, then one infers a different $\alpha$, but this change in $\alpha$ is proportional to the
ratio of the acoustic scales, as defined by the sound horizon integral for each cosmology. The
stability of this scaling appears to be much better than the statistical errors implied by the surveys
that are defining the range of interesting cosmological parameter space (Seo et al., 2010b). In other
words, one can use the acoustic scale integral to adjust distance scale measurements of $D_A/r_s$ and
$H r_s$ between different cosmologies within the domain of interest.

Extending these approaches to the anisotropic case so as to extract $D_A$ and $H$ separately is more
complicated and has not been fully developed. The primary obstacle is to account for the anisotropic
distortions from peculiar velocities. Examples of fit methodologies include Okumura et al.,
Padmanabhan and White (2009), Shoji et al. (2009), Chuang and Wang (2011), and Kazin et al.
(2011).

With better modeling of non-linear structure and galaxy clustering bias, one could of course
extract additional cosmological information from the two-point clustering of galaxies. In particular,
one can measure the distance scale from the curvature (i.e., non-power-law form) of the spherically
averaged power spectrum or correlation function. This physical scale arises from the size of the
horizon at matter-radiation equality, parameterized as $\Omega_m h$ in typical cosmologies. However, this
curvature is a much broader feature and thus provides less leverage on distance. Most important,
the width of the feature is comparable to the scale itself, implying that one must control all extra
broad-band sources of power and scale-dependent galaxy biases in order to extract accurate distance
information. This is much more challenging than the BAO application, but it is an important
frontier of the field of large-scale structure. In particular, the application of this approach to the
quadrupole distortion known as the Alcock-Paczynski effect will be discussed in §7.3.

4.4. Observational Considerations

4.4.1. Statistical Errors

The primary challenge of the BAO method is that very large samples of galaxies (or other 
tracers) are required to detect the acoustic oscillations and hence measure a distance. Like detecting 
an emission line in a galaxy spectrum in order to measure a redshift, one must have high enough 
signal-to-noise to detect the BAO peak or one gets no useful distance information at all. The 
minimum useful survey volumes are of order $1\,h^{-3}$ Gpc$^3$, which yield a distance precision of about



The two components of the statistical error are sample variance and shot noise (Kaiser, 1986).
A given survey volume contains only a certain number of Fourier modes; in the periodic box
approximation $dN_{\rm modes}/dk = 4\pi k^2 V/[(2\pi)^3\,2]$, where the final factor of 2 in the denominator accounts
for the fact that the density field is real. In a Gaussian random field, the real and imaginary parts
of each mode are independent with an intrinsic variance of $P(k)^2/2$. In addition, each mode is
imperfectly measured due to shot noise; when treated in the Gaussian approximation (ignoring the
4-point contributions from the Poisson distribution), this raises the variance on the complex norm
to $[P(k) + 1/n]^2$, where $n$ is the number density of tracers. The result is that the fractional error
bar on the measurement of each mode is $\sigma_P/P = (nP + 1)/nP$. When combining information from
modes, we should sum the inverse variances, which are

$$\frac{P^2}{\sigma_P^2} = \left(\frac{nP}{nP + 1}\right)^2 . \tag{55}$$

We see that for $n \gg 1/P(k)$ we get unit information from each mode but that the information
drops rapidly for $n < 1/P(k)$. We note that the relevant $P$ is the redshift-space power spectrum;
this can be substantially larger than the real-space power spectrum for nearly radial large-scale
modes, thereby decreasing the shot noise impact on BAO estimation of the Hubble parameter.
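The mode-counting argument can be turned into a quick estimate of the band-power error. The sketch below assumes a periodic box and a thin shell in $k$; the function names and the example inputs (a volume of $10^9$ in cubed length units, $k = 0.1$, $dk = 0.01$, $nP = 2$) are illustrative choices, not survey specifications from the text.

```python
import math


def n_modes(k, dk, volume):
    """Independent modes in a shell: 4 pi k^2 dk V / [2 (2 pi)^3]."""
    return 4.0 * math.pi * k**2 * dk * volume / (2.0 * (2.0 * math.pi) ** 3)


def mode_weight(nP):
    """Inverse fractional variance per mode, equation (55): (nP / (nP + 1))^2."""
    return (nP / (nP + 1.0)) ** 2


def fractional_band_error(k, dk, volume, nP):
    """Fractional error on P(k) in the shell after summing inverse variances."""
    return 1.0 / math.sqrt(n_modes(k, dk, volume) * mode_weight(nP))


# Illustrative inputs: volume in (length unit)^3, k and dk in inverse length units.
volume, k, dk, nP = 1.0e9, 0.1, 0.01, 2.0
print("modes in shell:      ", n_modes(k, dk, volume))
print("information per mode:", mode_weight(nP))
print("fractional error:    ", fractional_band_error(k, dk, volume, nP))
```

Raising $nP$ from 2 to very large values changes the per-mode weight only from 4/9 to 1, which is why pushing the sampling density far beyond $nP \sim 1$ gives diminishing returns compared with adding volume.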

The mode-counting argument above neglects boundary effects, effectively assuming that the 
survey volume is reasonably contiguous with a high filling factor on scales of 150 Mpc. In real 
space, we can express this as the requirement that the number of pairs of survey galaxies at 150 
Mpc separation not be significantly diminished compared to the case of a filled periodic box. In 
Fourier space, we must ensure that the survey window function not create aliasing between modes 
in the crests and troughs of the acoustic oscillations. 

Converting a power spectrum forecast into constraints on the distance scale requires marginalizing
over other cosmological parameters. This has been done with Fisher matrix analyses (Seo and
Eisenstein, 2003; Seo and Eisenstein, 2007) or with Monte Carlo approaches (Blake and Glazebrook,
2003; Glazebrook and Blake, 2005; Blake et al., 2006). Several analyses have focused on the anisotropy
of the power spectrum in order to measure $H(z)$ and $D_A(z)$ separately.

Seo and Eisenstein (2007) constructed a fast approximation to the full Fisher matrix calculation
using an idealized treatment of the acoustic oscillation, including non-linear structure formation
and redshift distortions. This method allows forecasts for $H$ and $D_A$ precision as a function of
survey redshift, number density, and volume (see their eq. 26). Tests with simulations (Seo et al.,
2010b; Mehta et al., 2011) have shown this forecast to be accurate to within 10-20%, with a small
trend toward over-optimism at $nP < 1$. Whether this trend is intrinsic to shot noise or to the fact
that the low number density models used higher mass thresholds for halo bias is not clear
at present.

Table 2 presents a summary of cosmic variance limited BAO performance. This is a tabulation
of the Seo and Eisenstein (2007) forecasts for a full-sky survey, using even binning in $\ln(1+z)$. We
assume a shot noise level of $nP = 2$ at $k = 0.2\,h$ Mpc$^{-1}$ (see §4.4.3), and that reconstruction has
decreased the non-linear displacements by a factor of two in length scale, i.e., reducing the quantity
$\Sigma_{\rm nl}$ in equation (54) by a factor of two below its full non-linear value at each redshift. Figure 14,
discussed further in the next section, presents graphical summaries of the main features of Table 2.
One can see that the precision available in $D_A(z)$ and $H(z)$ is excellent: of order 0.2% per redshift
bin at high redshift. At low redshift, the precision is worse because there is far less cosmic volume.
Of course, these statistical errors scale as $f_{\rm sky}^{-1/2}$, where $f_{\rm sky}$ is the fraction of sky surveyed.
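The sky-fraction scaling is simple enough to apply by hand, but a small helper makes the conversion from the full-sky forecasts explicit. The function name and the example sky fraction of one quarter are assumptions for illustration; the 0.24% input is the full-sky $H(z)$ error quoted in Table 2 for the bin $1.62 < z < 2.00$.

```python
import math


def scale_to_sky_fraction(full_sky_error, f_sky):
    """Rescale a full-sky statistical error by f_sky^(-1/2)."""
    if not 0.0 < f_sky <= 1.0:
        raise ValueError("f_sky must lie in (0, 1]")
    return full_sky_error / math.sqrt(f_sky)


# Full-sky H(z) error of 0.24% rescaled to a survey covering a quarter of the sky.
print(scale_to_sky_fraction(0.24, 0.25))
```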

4.4.2. From BAO to Dark Energy
We will explore how these performance estimates map to dark energy parameter forecasts in
a later section, but here we describe some simplified treatments in order to build intuition. Beginning at low
redshift, if we consider that CMB anisotropies give precise values for $\Omega_m h^2$ and the acoustic scale $r_s$,
then a BAO detection near $z = 0$ is measuring a standard ruler and hence $H_0$ (Eisenstein and Hu,
1998). Combining that with $\Omega_m h^2$ yields $\Omega_m$. No BAO measurement can be strictly at $z = 0$,
but the inference of $\Omega_m$ and $H_0$ depends only on the distance scale between $z = 0$ and the survey
redshift. Even at $z = 0.35$, this brings in only a mild dependence on $w$ and $\Omega_k$ (Eisenstein et al.,
2005). Hence, low-redshift BAO measurements offer a strong measurement of $\Omega_m$. Determining
$\Omega_m$ breaks a key degeneracy for the SN measurements between $\Omega_m$ and $w$.

Moving to higher redshift ($z > 1$), we next consider the evolution of the density of dark energy
using only the $H(z)$ information from BAO (right panel of Figure 14). If we know the matter density
and spatial curvature perfectly, then the Friedmann equation directly relates the measurement
of $H(z)$ to the density of dark energy at that redshift. Considering the null hypothesis of the
cosmological constant, we would achieve a detection of the dark energy density with a significance
of $\Omega_\Lambda H/2\sigma(H)$, where $\sigma(H)$ is the error on $H(z)$. We next want to consider the variation in the
dark energy density. Taking an example in which one assumes the $z = 0$ value is known perfectly,
we can translate the error at a given redshift to the error on the exponent of a power-law variation,
which can in turn be rewritten as an error on a constant $w$ (eq. 23). Of course, a full analysis must
include the uncertainties on the matter density, spatial curvature, and $z = 0$ value of $\Omega_\Lambda$.
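The chain from an $H(z)$ error to a dark energy detection significance and an error on constant $w$ can be written out explicitly. The sketch below assumes a flat universe with $\Omega_m = 0.25$ (the value quoted in the note to Table 2), takes the bin center to be the midpoint in $\ln(1+z)$ (an assumption about how the tabulated quantities are evaluated), and uses the 0.24% $H(z)$ error listed in Table 2 for the bin $1.62 < z < 2.00$. The function names are illustrative.

```python
import math

OMEGA_M = 0.25  # matter density assumed in Table 2 (flat universe, w = -1)


def bin_center(z_min, z_max):
    """Bin center in ln(1+z): an assumed definition of 'the bin center'."""
    return math.sqrt((1.0 + z_min) * (1.0 + z_max)) - 1.0


def omega_lambda(z, omega_m=OMEGA_M):
    """Dark energy fraction of the critical density at redshift z (flat LCDM)."""
    matter = omega_m * (1.0 + z) ** 3
    return (1.0 - omega_m) / (matter + 1.0 - omega_m)


def detection_significance(z, frac_err_h):
    """Omega_Lambda(z) H / (2 sigma_H): since H^2 is proportional to the total
    density, a fractional error e on H is an error 2e on the total density."""
    return omega_lambda(z) / (2.0 * frac_err_h)


def sigma_w(z, frac_err_h):
    """Error on constant w from rho_DE(z)/rho_DE(0) = (1+z)^(3(1+w))."""
    frac_err_rho = 1.0 / detection_significance(z, frac_err_h)
    return frac_err_rho / (3.0 * math.log(1.0 + z))


# Table 2 row 1.62 < z < 2.00, with a 0.24% error on H(z).
z = bin_center(1.62, 2.00)
print(omega_lambda(z), detection_significance(z, 0.0024), sigma_w(z, 0.0024))
```

Under these assumptions the three numbers come out close to the tabulated 0.119, 25.2, and 0.013 for that bin; small differences are expected because the tabulated percentage errors are rounded.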

Despite the simplifications, Table 2 and Figure 14 offer some important results for building
intuition. We find that the sensitivity of BAO $H(z)$ to dark energy has a broad maximum over
the range $0.6 < z < 3.5$. This plateau arises because the declining dynamical importance of dark
energy is compensated by the increasing statistical precision afforded by larger comoving volume.
For $w = -1$, dark energy is only 10% of the total density at $z = 2$, but a cosmic-variance-limited
BAO measurement can detect that density at 20σ significance. The large lever arm to $z = 0$
translates this into a 1.3% constraint on a constant $w$ model. Of course, if $w > -1$, then $\rho_{\rm DE}$ is
higher at high redshift than it is for a cosmological constant, increasing the statistical significance
with which BAO can detect it.

Table 2. BAO Forecasts for a Full-Sky BAO Survey

z_min   z_max   Volume   % Err D_A(z)   % Err H(z)   Ω_Λ(z)   Ω_Λ H/2σ_H   σ_w
0.00    0.15    0.33     2.8            4.9          0.708    7.3          0.64
0.15    0.32    2.62     0.95           1.7          0.616    18.2         0.088
0.32    0.51    7.89     0.53           0.96         0.515    27.0         0.036
0.51    0.73    16.5     0.35           0.63         0.413    32.9         0.021
0.73    0.99    28.4     0.26           0.46         0.318    34.9         0.015
0.99    1.28    42.9     0.21           0.36         0.236    33.3         0.013
1.28    1.62    59.0     0.17           0.28         0.170    30.2         0.012
1.62    2.00    75.8     0.14           0.24         0.119    25.2         0.013
2.00    2.44    92.3     0.13           0.21         0.082    20.0         0.014
2.44    2.95    108      0.12           0.18         0.056    15.5         0.016
2.95    3.53    121      0.11           0.17         0.038    11.4         0.020
3.53    4.20    133      0.10           0.15         0.025    8.3          0.025
4.20    4.96    142      0.10           0.15         0.017    5.8          0.033

Note. These forecasts assume a full-sky survey, use $nP = 2$ at $k = 0.2\,h$ Mpc$^{-1}$, and assume
reconstruction improvements in the non-linear damping by a factor of 2. Statistical errors scale as
$f_{\rm sky}^{-1/2}$. The first and second columns give the inner and outer edges of redshift bins; the bins
have equal width in $\ln(1+z)$. The third column gives the comoving volume of the bin in
$h^{-3}$ Gpc$^3$, assuming $\Omega_m = 0.25$. The fourth and fifth columns give 1σ fractional errors (in
percent) in $D_A(z)$ and $H(z)$, the angular diameter distance to and Hubble parameter at the bin
center; note that the errors on these two quantities are 40% correlated. We assume the sound horizon
is known. The sixth column gives $\Omega_\Lambda(z)$, i.e., the ratio of the dark energy density to the critical
density at that redshift in a Λ-model. Column 7 gives $\Omega_\Lambda(z)H/2\sigma_H$, which is the significance at
which one would detect the cosmological constant at redshift $z$ using only the $H(z)$ BAO constraint
and assuming perfect knowledge of the matter density and curvature (a good approximation, but not
exact). Column 8 shows the error on a constant value of $w$ that would be obtained by comparing the
BAO $H(z)$ measurement for this one redshift to the value of $\rho_{\rm DE}$ at $z = 0$, assuming the latter
is known perfectly.

Figure 14 Illustrative BAO forecasts for a full-sky survey, from Table 2. All errors scale as $f_{\rm sky}^{-1/2}$;
note that y-axes are inverted so that smaller errors appear higher on the plot. (Left) The fractional
error on $D_A(z)$ and $H(z)$ in logarithmic redshift bins, as open and solid points, respectively. Note
that performance of order 0.2% per bin is possible at high redshift. Here we assume the sound
horizon is known. (Right) Illustration of the dark energy leverage available simply from the $H(z)$
information in the previous panel. Assuming perfect knowledge of the matter density (i.e., $\Omega_m H_0^2$)
and curvature, measurement of $H(z)$ determines the dark energy density. The solid black points
show the resulting fractional errors on the dark energy density as a function of redshift, assuming
that the value is close to the cosmological constant. Errors of order 5%, i.e., a 20σ detection of
dark energy, are possible at $0.5 < z < 2$, even if the dark energy density is simply constant. The
evolution of dark energy can be expressed by comparing the density at high redshift to that at
$z = 0$, assuming the latter is known. The open red points give the error on a power-law evolution
in $1 + z$, expressed as the error on a constant $w$. One sees that there is a broad maximum in
performance extending out to $z \sim 3$. Of course, we must measure the dark energy density at $z = 0$;
the blue arrow shows the fractional error on that density that would result from a 1% measurement
of $H_0$ (which one might get from direct measures or from a combination of BAO and supernovae),
assuming perfect knowledge of the matter density. That the blue arrow is comparable to or above
the solid points indicates that we can reasonably expect to be limited by our higher redshift data.
The open points are optimistic in that we have assumed perfect knowledge of various inputs; the
intended lesson is that the large volume and larger redshift lever arm at higher redshift can offset
the fact that the dark energy makes up a smaller fraction of the cosmic total.

Meanwhile, the transverse acoustic scale at $z \sim 2$ and above can be compared to the angular
acoustic scale in the CMB to give a combination constraint on early dark energy and the curvature of
the universe (McDonald and Eisenstein, 2007). This has considerable value in breaking degeneracies
between curvature and dark energy parameters at lower redshift, and it should be considered an
important consistency check for the ΛCDM interpretation of the CMB. A clear detection of non-zero
curvature would have major implications for inflation, and perhaps for quantum cosmology
theories.
4.4.3. Sampling Density

The acoustic oscillations in the power spectrum are primarily at wavenumbers 0.1-0.2 $h\,{\rm Mpc}^{-1}$,
so we want to design surveys with $nP(k = 0.2\,h\,{\rm Mpc}^{-1}) > 1$. Furthermore, if sample size
grows in proportion to survey time, then $nP(k) = 1$ is the optimal balance of survey depth to
sample volume at a given wavenumber (Kaiser, 1986a). One should beware that this assumption rarely
holds in surveys with multi-object instruments: the exposure time is driven by the faintest objects
in the survey, so that brighter galaxies are being overexposed in the chosen observation time. Also,
the number density is often a function of redshift, so one cannot hit the optimal density everywhere
in the survey. Finally, one might care about distance precision differently at different redshifts
because of one's specific goals for testing dark energy models. See Parkinson et al. (2007) for a
worked example of survey optimization.

For the concordance cosmology, the amplitude of the power spectrum at $k = 0.2\,h\,{\rm Mpc}^{-1}$ is
about $2700\,\sigma_{8,g}^2\,h^{-3}\,{\rm Mpc}^3$, where $\sigma_{8,g}^2$ is the variance of the
fractional overdensity of the chosen tracer at the survey redshift in spheres of $8\,h^{-1}$ Mpc
radius. This implies that we seek number densities around
$n = (4 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3})/\sigma_{8,g}^2$. Fortunately, this is well below the
density of $L^*$ galaxies.

Higher galaxy bias is a good thing for the statistical errors of a BAO survey. The power
spectrum amplitude scales as the square of the bias, so an early-type galaxy is 3-4 times more
valuable (in the sense of boosting $nP$) than a late-type galaxy. There is no identified risk in
this choice: higher bias galaxies have larger acoustic scale shifts, but this is correctable
(Figure 12, right). It therefore makes sense to use higher bias tracers when possible. However,
lower bias tracers can be more effective if one can acquire their redshifts sufficiently quickly!
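
The sampling arithmetic above can be written out explicitly. The sketch below is only an
illustration of the scalings quoted in the text: it takes the power at $k = 0.2\,h\,{\rm Mpc}^{-1}$
to be $2700\,\sigma_{8,g}^2\,h^{-3}\,{\rm Mpc}^3$, solves $nP = 1$ for the number density, and
compares tracers through the square of their bias. The function names and the example bias values
are assumptions made for the illustration.

```python
P_PER_UNIT_VARIANCE = 2700.0  # P(k = 0.2 h/Mpc) in h^-3 Mpc^3 per unit sigma_{8,g}^2 (from the text)


def power_at_k02(sigma8_g):
    """Tracer power spectrum amplitude at k = 0.2 h/Mpc, in h^-3 Mpc^3."""
    return P_PER_UNIT_VARIANCE * sigma8_g ** 2


def density_for_np(sigma8_g, target_np=1.0):
    """Comoving number density (h^3 Mpc^-3) that gives nP = target_np at k = 0.2 h/Mpc."""
    return target_np / power_at_k02(sigma8_g)


def relative_value(bias_a, bias_b):
    """Ratio of the nP contributed per galaxy by tracer a relative to tracer b."""
    # P scales as bias squared, so at fixed number density nP scales the same way.
    return (bias_a / bias_b) ** 2


n_unit = density_for_np(1.0)        # density needed when sigma_{8,g} = 1
n_biased = density_for_np(2.0)      # a tracer with twice the clustering amplitude
gain = relative_value(2.0, 1.0)     # illustrative bias values, not from the text

print(f"n(nP=1, sigma8_g=1) = {n_unit:.2e} h^3 Mpc^-3")
print(f"n(nP=1, sigma8_g=2) = {n_biased:.2e} h^3 Mpc^-3")
print(f"per-galaxy value ratio for bias 2 vs 1: {gain:.1f}")
```

A bias ratio between roughly 1.7 and 2 reproduces the factor of 3-4 quoted for early-type versus
late-type galaxies.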

The balance of shot noise to sample variance is more complex in the case of surveys with
the Lyα forest or HI intensity mapping. However, the idea is the same: one wants to make a
map in which the pixel noise is dominated by sample variance, but not by much. The power at
$k = 0.2\,h\,{\rm Mpc}^{-1}$ corresponds roughly to density variance of 8 Mpc spheres. Hence, we
seek to measure the density of individual regions of this size to a precision slightly better than
the intrinsic rms for such volumes for the chosen tracer (i.e., $b\sigma_8$). If one is measuring
too well, one would prefer to do shallower measurements over a wider region. In the case of the
Lyα forest, this criterion concerns both the areal density of the quasar sightlines and the
signal-to-noise ratio of the spectra (McDonald and Eisenstein, 2007; McQuinn et al., 2011).



4.4.4. Spectroscopic vs. Photometric Redshifts

Photometric redshifts offer a cheap way to measure many galaxy redshifts and hence to measure
the BAO (Seo and Eisenstein, 2003; Glazebrook and Blake, 2005; Dolney et al., 2006; Seo and
Eisenstein, 2007). However, the larger errors are a challenge. For velocity errors larger than
1000 km s$^{-1}$ one is smearing out the acoustic scale along the line of sight and failing to
measure $H(z)$. Note that this scale is set by the width of the acoustic peak, not by the acoustic
scale. One only retains full information with rms precision below 300 km s$^{-1}$.

To measure $D_A(z)$, in principle precisions of $\sigma_z/(1+z)$ of 4% are enough. Worse precision
causes catastrophic degradation because the oscillations in angular power at the front and back of
the photometric redshift slab fall out of phase. Redshift precision of 3-4% yields poor constraints
on the BAO per unit volume, with a rule of thumb that one needs ten times more volume for a
photometric redshift survey than a spectroscopic survey (Seo and Eisenstein, 2007). Better redshift
precision reduces this gap. At $z < 0.7$, current and ongoing spectroscopic surveys are already
covering 1/4 of the sky, so photometric redshift surveys are only competitive at higher redshifts.
Extracting large-scale structure and BAO from photometric redshift surveys requires very stringent
calibration and more extensive modeling than for spectroscopic surveys. Photometric surveys with
many narrow bands offer an intermediate approach between imaging and spectroscopy, which may
be advantageous in some regimes (Benitez et al., 2009).
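
To see why the velocity thresholds matter, one can convert a redshift error into a comoving
displacement along the line of sight, using $\sigma_z = (1+z)\,\sigma_v/c$ and $dr = c\,dz/H(z)$.
The sketch below does this for a flat ΛCDM model with $\Omega_m = 0.25$; the cosmology, the choice
of $z = 1$, and the function names are assumptions made for the illustration.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
OMEGA_M = 0.25       # assumed flat LambdaCDM matter density


def hubble(z, omega_m=OMEGA_M):
    """H(z) in km/s per h^-1 Mpc for flat LambdaCDM (so H0 = 100 in these units)."""
    return 100.0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def los_smearing_from_velocity(sigma_v_km_s, z):
    """Comoving radial rms displacement (h^-1 Mpc) caused by an rms velocity error."""
    # sigma_z = (1+z) sigma_v / c and dr = c dz / H(z), so c cancels.
    return (1.0 + z) * sigma_v_km_s / hubble(z)


def los_smearing_from_photoz(sigma_z_over_1pz, z):
    """Comoving radial rms displacement (h^-1 Mpc) for a given sigma_z / (1+z)."""
    return C_KM_S * sigma_z_over_1pz * (1.0 + z) / hubble(z)


z_example = 1.0
spectro_good = los_smearing_from_velocity(300.0, z_example)
spectro_poor = los_smearing_from_velocity(1000.0, z_example)
photoz = los_smearing_from_photoz(0.04, z_example)

print(f"300 km/s  -> {spectro_good:.1f} h^-1 Mpc")
print(f"1000 km/s -> {spectro_poor:.1f} h^-1 Mpc")
print(f"4% photo-z -> {photoz:.0f} h^-1 Mpc")
```

With these assumptions a 300 km s$^{-1}$ error displaces galaxies by a few $h^{-1}$ Mpc, well
inside the width of the acoustic peak, while a 4% photometric error smears them over a distance
comparable to the acoustic scale itself, which is why the radial $H(z)$ information is lost.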



4.4.5. Tracers of Structure

As we have seen, BAO surveys require surveys of very large volumes with modest sampling
density. One wants to map a wide range of redshift so as to measure the history of expansion. The
current generation of surveys is mapping of order 1 million galaxies, and approaching the cosmic
variance limit at $z > 1$ requires of order $10^8$ galaxies.

We have a lot of freedom in selecting the objects to trace the density field. Usually, we require
isotropy of the selection but do not require that the selection be unchanging as a function of redshift.
One is seeking to minimize the observational cost for a given well-sampled survey volume. There
are many competing considerations (Glazebrook and Blake, 2005). One desires a tracer with a
strong spectroscopic signature to allow a redshift determination to about 300 km s$^{-1}$ rms as
fast as possible, with few catastrophic redshift errors. One desires a combination of density and
clustering bias so that $nP(k = 0.2\,h\,{\rm Mpc}^{-1}) > 1$. One desires a higher clustering bias,
so that the required number density is lower; this allows one to use brighter objects and reduce
exposure time. For targeted surveys, one desires that the tracer can be readily selected, so that
one doesn't waste resources on undesired objects. In more detail, one desires photometric redshifts
good enough that one can shape the $n(z)$ profile in a way that keeps $nP$ close to unity at high
redshift without being swamped by low luminosity objects at low redshift. And, of course, the
observed wavelength of the spectroscopic feature determines a great deal about the instrumentation.

Luminous red galaxies are an effective choice at lower redshifts. They have strong absorption
features, notably the 4000 Å break, and high surface brightnesses to allow rapid spectroscopy. They
have a high bias ($b \approx 2$) to reduce the required number density and hence the number of
spectroscopic fibers. They are also easy to select with photometric redshifts: essentially they are
the reddest galaxies at a given observed flux (Eisenstein et al., 2001).

As we work to $z = 1$ and beyond, the advantages of using emission-line galaxies increase. Red
galaxies are very faint in the optical at $z > 1$ because of K corrections, and the 4000 Å break
moves into the infrared where the forest of OH sky glow lines makes spectroscopy more difficult. But
the star formation rates of normal galaxies at $z > 1$ are about ten times higher than today, and
this high star formation produces strong emission lines. These emission lines can be detected even
when the stellar continuum cannot, and the galaxies with the strongest lines can be measured in
remarkably little time. Spectral resolution of a few thousand is desirable to work between the OH
sky glow lines and to resolve the [O II] doublet. The challenge is primarily one of selection: how
to use photometric data to pick out the star forming galaxies with the strongest lines. Between
the lower clustering bias and the failure rate on weaker-lined systems, one needs to survey many
more blue galaxies than red. Current expectations are that the transition point between preferring
red galaxies and preferring blue ones is $z \approx 1$ (Glazebrook and Blake, 2005). Slitless
spectroscopy offers an alternative route to surveying emission line galaxies, without prior target
selection (see §4.6).

Clusters of galaxies have been proposed as tracers (Angulo et al., 2005; Hütsi, 2006). These
can be readily found from imaging data sets but have the disadvantage that their $nP$ does not
reach unity. Also, acquiring spectroscopic redshifts for the clusters imposes requirements similar
in area and depth to a red galaxy survey, so the gain in a highly multiplexed fiber survey is much
smaller than one would expect based on numbers alone. Quasars have a similar problem of having
$nP < 1$, but they are extremely luminous and easy to select. This makes them a possible target
for a sparse, wide-field survey at $z > 1$ (Sawangwit et al., 2011b), and they are readily added to
a multi-fiber survey targeting emission line galaxies or LRGs at other redshifts.
Using the Lyα forest as a tracer is attractive because each spectrum yields many density
measurements (effectively about 50) rather than just a single point in a map (White, 2003; McDonald
and Eisenstein, 2007; Norman et al., 2009; McQuinn et al., 2011). One wants to sample the width of
the acoustic peak, which is about $20\,h^{-1}$ Mpc FWHM. This implies that one needs spectral
resolution of only a few hundred and moderate angular density of the lines of sight, preferably
about 100 per square degree. Quasars of this surface density are much brighter than the Lyman-break
galaxies that would be required to match the effective sampling density. As one has little gain from
reducing the photon noise errors to below the intrinsic variation of the forest on 10 Mpc scales,
one can afford to use low signal-to-noise ratio spectra. The challenge here is systematics, as one
must control the continuum of the quasar and the spectrophotometry of the measurements to utilize
the spectral information. It is also possible that theory systematics associated with the state of
the IGM enter; so far, IGM uncertainties have not been shown to affect BAO measurements from the
forest, but the case has not been investigated as thoroughly as it has for galaxies.
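
The link between the width of the acoustic peak and the required spectral resolution can be made
concrete. A comoving interval $\Delta r$ along the line of sight corresponds to a velocity interval
$\Delta v = H(z)\,\Delta r/(1+z)$, and the resolving power that matches it is $R = c/\Delta v$. The
sketch below evaluates this for the $20\,h^{-1}$ Mpc FWHM quoted above, assuming a flat ΛCDM model
with $\Omega_m = 0.25$ and an illustrative forest redshift of $z = 2.5$; neither the redshift nor
the function names come from the original text.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
OMEGA_M = 0.25       # assumed flat LambdaCDM matter density


def hubble(z, omega_m=OMEGA_M):
    """H(z) in km/s per h^-1 Mpc for flat LambdaCDM."""
    return 100.0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def velocity_width(delta_r, z):
    """Velocity interval (km/s) spanned by a comoving radial interval delta_r (h^-1 Mpc)."""
    return hubble(z) * delta_r / (1.0 + z)


def matching_resolving_power(delta_r, z):
    """Resolving power R = lambda / delta_lambda whose element spans delta_r along the sightline."""
    return C_KM_S / velocity_width(delta_r, z)


z_forest = 2.5        # illustrative redshift, an assumption
peak_fwhm = 20.0      # h^-1 Mpc, from the text
dv = velocity_width(peak_fwhm, z_forest)
r_match = matching_resolving_power(peak_fwhm, z_forest)

print(f"peak FWHM spans {dv:.0f} km/s at z = {z_forest}")
print(f"R matching the FWHM: {r_match:.0f}")
```

A resolution element equal to the full peak width corresponds to $R$ of order 150 under these
assumptions, so placing a couple of resolution elements across the peak leads to the "few hundred"
quoted in the text.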

Star-forming galaxies also can be observed in the radio using the 21 cm line of neutral hydrogen.
This is a much weaker line, but future generations of radio interferometers such as the Square
Kilometer Array offer phenomenal survey speed because one can synthesize millions of simultaneous
beams computationally. Such instruments could in principle achieve spectroscopic samples of 10^
galaxies out to $z = 2$.



A different concept is that of 21 cm intensity mapping (Peterson et al., 2006; Ansari et al., 2008;
Chang et al., 2008; Loeb and Wyithe, 2008; Wyithe et al., 2008; Seo et al., 2010a). Here
one does not identify individual galaxies but instead measures the combined emission of the 21
cm line from all galaxies in a volume of order 10 Mpc on a side. The fluctuations in the map
encode the large-scale density field and hence the BAO. Relative to an interferometer like the SKA,
one uses shorter baselines (around 300 meters) and a nearly filled aperture to maintain surface
brightness sensitivity.† Because one is not resolving individual galaxies above the instrumental
noise, one is using all of the neutral hydrogen even from low-mass galaxies. In principle, one can
map the BAO to the cosmic variance limit out to $z \sim 3$ with new interferometric arrays. The
challenge is foreground subtraction, as the cosmic signal is several orders of magnitude below the
Galactic and extragalactic emission levels. A first detection of large-scale structure in redshifted
21 cm has been reported by Chang et al. (2010) by cross-correlating with an optical galaxy redshift
survey at $z = 0.8$; cross-correlation removes foregrounds that are not themselves correlated with
the optical galaxies. For intensity mapping to work on its own, one of course needs to measure the
auto-correlation signal.

Unlike for the case of galaxies, diffuse HI mapping does not provide the mean level of emission
(interferometers are not sensitive to this, and even if they were the Galactic emission would swamp
extragalactic HI); therefore $\delta_{\rm HI}$ is measured only up to a multiplicative constant.
This does not present a problem for the BAO technique because one is using the shape and not the
amplitude of the power spectrum. It does have an impact on redshift-space distortions (§7.2), as
without the mean level one cannot turn the observable $\beta$ into an estimate of the rate of growth
of structure $f\sigma_8(z)$. This drawback is, however, also an opportunity to learn about
astrophysics: measurement of $\beta_{\rm HI}$ combined with independent knowledge of $f\sigma_8(z)$
would allow us to infer the mean HI signal and thus obtain the cosmic abundance of neutral gas as a
function of time (Wyithe and Brown, 2010).



† An interferometer directly measures the Fourier transform (in the transverse direction) of the
emission field; antennas separated by a distance $L$ measure Fourier modes with
$k_\perp = 2\pi L/(\lambda D_A)$.
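
As a small numerical illustration of the footnote's relation, the sketch below evaluates
$k_\perp = 2\pi L/(\lambda D_A)$ for the redshifted 21 cm line. The function name is hypothetical,
and the distance passed in the example is a placeholder rather than a value from the text; the
returned wavenumber is in the inverse of whatever distance unit and convention (comoving or proper)
is used for $D_A$.

```python
import math

LAMBDA_21CM_M = 0.21106  # rest-frame wavelength of the HI hyperfine line, in meters


def k_perp(baseline_m, z, d_a_mpc):
    """Transverse wavenumber (Mpc^-1) probed by a baseline of length baseline_m at redshift z.

    d_a_mpc is the distance D_A in Mpc, supplied by the caller in the convention they want
    the resulting wavenumber to be expressed in.
    """
    observed_wavelength_m = LAMBDA_21CM_M * (1.0 + z)
    # The baseline in units of the observed wavelength sets the angular Fourier mode;
    # dividing by the distance converts it to a spatial wavenumber.
    return 2.0 * math.pi * baseline_m / (observed_wavelength_m * d_a_mpc)


# Illustrative call: a 300 m baseline at z = 1 with a placeholder distance of 1700 Mpc.
k_example = k_perp(300.0, 1.0, 1700.0)
print(f"k_perp = {k_example:.2f} Mpc^-1")
```

Because $k_\perp$ is linear in $L$, halving the baseline halves the transverse wavenumber, which is
why compact arrays with many short baselines are suited to the large-scale modes that carry the BAO.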
4.5. Systematic Uncertainties and Strategies for Amelioration

Given that we seek to measure the acoustic scale and hence cosmic distance scale to a high
level of precision, it is important to consider the systematic errors that could cause the inferred
$D_A(z)/r_s$ and $H(z)r_s$ to be incorrect. We consider three classes of systematic errors:
(1) Observational errors, in which one mis-measures the large-scale structure of the universe;
(2) Astrophysical errors, in which our model of large-scale structure for a given cosmology is
incorrect; and (3) Cosmological errors, in which we mis-predict the sound horizon given our
measurements because of new cosmological physics, either in the early universe or at late times.



4.5.1. Measurement Systematics

The measurement of large-scale structure requires the ability to produce a well-calibrated den- 
sity map of the universe. The data need not be homogeneous in quality so long as the inhomo- 
geneities are known well enough that one can correct for them statistically. 

Observational errors involve imperfections in one's map of the density field. Examples of sources 
can be photometric miscalibrations of the input catalog, mis-assessments of the incompleteness in 
the input catalog, redshift failures or errors, incorrect tracking of the target selection, failure to 
correct for deleterious interactions between targets (e.g., fiber collisions), or imperfect assessment 
of the redshift distribution of the map. Another class of problems involves understanding of the 
errors of the map, as one must assess both the statistical properties of the density field and the 
point sampling of it by galaxies. 

Fortunately, these issues have been extensively studied in the general context of the measurement
of large-scale structure (e.g., Tegmark et al., 1998). BAO measurement itself is only a particular
application of large-scale structure data, and it turns out to be a relatively easy one because
the acoustic peak is narrow in scale and hence one has another differential opportunity in the
experimental design. That is, one can compare the behavior at 150 Mpc separation to the average of
that at 120 and 180 Mpc, so as to remove smooth errors. The only way to produce a non-smooth
error is to have a sharply preferred scale in the systematic error.
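
The differential comparison described above can be illustrated with a toy calculation. The sketch
below forms the difference between a correlation function at 150 Mpc and the average of its values
at 120 and 180 Mpc, and shows that an additive systematic consisting of a constant plus a linear
tilt cancels in that combination. The Gaussian bump, its amplitude, and the size of the systematic
are invented for the illustration and are not a model of the real acoustic peak.

```python
import math


def differential_signal(xi, r_peak=150.0, offset=30.0):
    """Value of xi at r_peak minus the average of xi at r_peak - offset and r_peak + offset."""
    return xi(r_peak) - 0.5 * (xi(r_peak - offset) + xi(r_peak + offset))


def toy_peak(r):
    """Hypothetical narrow bump centered at 150 Mpc (a stand-in for the acoustic peak)."""
    return 1.0e-3 * math.exp(-0.5 * ((r - 150.0) / 10.0) ** 2)


def toy_peak_with_smooth_error(r):
    """The same bump plus a smooth systematic: a constant offset and a linear tilt."""
    return toy_peak(r) + 5.0e-3 + 2.0e-5 * (r - 150.0)


clean = differential_signal(toy_peak)
contaminated = differential_signal(toy_peak_with_smooth_error)

print(f"differential signal, clean:        {clean:.6e}")
print(f"differential signal, contaminated: {contaminated:.6e}")
```

The two numbers coincide because a symmetric three-point difference is blind to any constant or
linear term; only a systematic with structure on the scale of the peak width would survive.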

For galaxy redshift surveys, there is wide expertise in how to calibrate surveys and track their
selection functions, and there are many tests that can be employed to look for specific problems.
Failing that, residual errors are often intrinsically radial or angular in their nature, so one
can reject the purely radial and purely angular modes from a survey (Vogeley and Szalay, 1996;
Tegmark et al., 1998). This is a small cost in information content for an intrinsically 3-d field.
A more targeted version of this idea is to use angular templates to remove systematic errors with
particular angular dependence, e.g., survey depth variations due to sky brightness, seeing, or
stellar density (Ho et al., 2008; Ross et al., 2011). A further related idea is that for a sharp
scale in a systematic error to be a real threat, it must be sharp for three dimensional spheres of
separation. For example, even if a survey has an error that is modulated on a circular field of
view, the diameter of the field affects a range of 3-d separations at a given redshift simply
because of the random orientations of pairs to the line-of-sight.

The BAO method is ultimately tied to the separations of galaxies, which depend on astrometric
positions and redshifts. These quantities can be exquisitely well measured, and achieving 0.1%
precision on one's astrometric and wavelength scale is easy. The concern about systematic errors
in the map is that an erroneous tilt in the correlation function would cause one to mismeasure the
centroid of the acoustic peak. This is a weaker effect, and one can marginalize against such tilts if
one wants, using the techniques in §4.3.4.

In short, it is very likely that a reasonable design for a galaxy redshift survey will lead to
sufficient accuracy for the BAO method. The greater challenge for such surveys is to control the
clustering analysis for the broadband cosmological signals, which require a factor of more than ten
better accuracy.

On the other hand, the observational systematics for the Lyα forest and 21 cm intensity mapping
techniques are a serious concern. Here we are trying to use every spectral pixel for our mapping 
data, rather than differencing spectral pixels to measure a single redshift per object. Imperfections 
in our calibration of the spectra or our subtraction of sky emission or Galactic foregrounds will 
appear as cosmic structure. 

For the Lyα forest, we measure the absorption by assuming that the quasar continuum is
intrinsically smooth. However, even an unabsorbed spectrum would have variations due to the
intrinsic spectrum of the quasar and any errors in the removal of the sky emissions or flat-fielding
of the detector. We do not know the detailed unabsorbed quasar spectrum but instead need to
estimate it from the ensemble properties of quasars or from fitting to less absorbed pixels. The BAO
signal is a very weak modulation on large scales. Modeling errors far too weak to show up in any one
spectrum could inject correlations that bias the BAO scale or simply increase the noise far above
the expected sample variance. The most detailed discussions of systematic errors in large-scale
Lyα forest measurements are those of McDonald et al. (2006) and Slosar et al. (2011). Measuring
the scale of the BAO feature again appears much easier than determining the broad-band shape
or absolute amplitude of the power spectrum. Studies to date have not identified observational
problems that would prevent high precision BAO measurements, but the field is in its early days.

For 21 cm intensity mapping, we are looking for the correlations of the extragalactic line emission
as a function of wavelength (redshift) and sky position. However, the Milky Way is emitting
synchrotron and free-free emission three orders of magnitude higher than the extragalactic signal
and highly variable on the sky (Chang et al., 2008, 2010). Fortunately these emission mechanisms
are smooth as a function of frequency, unlike the cosmological signal where frequency maps to
redshift, which should enable foreground removal (e.g., Liu and Tegmark, 2011). The challenge
here is primarily instrumental: undesired features in radio interferometers such as far sidelobes
or standing waves in the antenna are strongly frequency dependent and can mimic a cosmological
signal if not suppressed. Moreover, the Galactic synchrotron radiation is polarized, and Faraday
rotation within the Galaxy can lead to strongly wavelength-dependent polarization amplitude (e.g.,
Haverkorn et al., 2003), so the instrument and software must measure the total intensity and remove
Stokes Q and U from their maps, a major challenge given that radio antennas are inherently
polarized. The problems are similar to those of the 21 cm mapping of the epoch of reionization,
where several experiments are trying to achieve first detections. Projects aimed at $z = 1$ are
being started in order to investigate and hopefully control the observational systematics.
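
The idea that spectral smoothness separates foregrounds from the cosmological signal can be shown
with a deliberately idealized toy. The sketch below builds a single sightline whose foreground is
an exact power law in frequency, a thousand times brighter than a rapidly oscillating mock signal,
fits a straight line in log-log space, and subtracts it. Everything here is an assumption made for
illustration: the band, the spectral index, the sinusoidal "signal", and the absence of noise and
instrumental chromaticity, which the text identifies as the real difficulty.

```python
import math


def fit_power_law(freqs, temps):
    """Least-squares straight line in log-log space; returns (intercept, slope)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(t) for t in temps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def remove_smooth_foreground(freqs, temps):
    """Subtract the best-fit power law from the spectrum and return the residual."""
    intercept, slope = fit_power_law(freqs, temps)
    return [t - math.exp(intercept + slope * math.log(f)) for f, t in zip(freqs, temps)]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy sightline: 256 channels between 600 and 800 MHz (hypothetical band).
freqs = [600.0 + 200.0 * i / 255 for i in range(256)]
foreground = [10.0 * (f / 700.0) ** -2.7 for f in freqs]           # ~10 K, smooth power law
signal = [0.01 * math.sin(2.0 * math.pi * f / 5.0) for f in freqs]  # ~10 mK, oscillatory
observed = [fg + s for fg, s in zip(foreground, signal)]

residual = remove_smooth_foreground(freqs, observed)
recovered = correlation(residual, signal)
print(f"correlation between residual and input signal: {recovered:.3f}")
```

In this noiseless toy the residual tracks the input signal closely because the foreground is
perfectly smooth; any frequency-dependent instrumental response would leak foreground power into
the residual, which is the failure mode the paragraph above warns about.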



4.5.2. Astrophysical Systematics

Astrophysical systematic errors are principally due to non-linear structure formation, redshift
distortions, and galaxy clustering bias. These were discussed in §4.3.2. Our understanding of these
effects in cold dark matter models has been greatly advanced by numerical simulations and analytic
theory over the past decades. Fortunately, the acoustic scale is much larger than the scales of
non-linear structure formation and the hydrodynamic effects on galaxy formation. Gravitational
forces are by far the dominant effect on 150 Mpc scales, and we can compute these at high accuracy.

Galaxy clustering bias based on halo occupations has been shown to be manageable for the BAO
method. The raw shifts of the acoustic scale are below 1%, and they can be reduced below any
reasonable detection limit with reconstruction (Mehta et al., 2011), as shown in Figure 12. Hence,
the concern is now for a more complicated clustering bias, e.g., one that couples more directly
to the large-scale density field or that features large-scale cooperative effects between galaxies
(Bower et al., 1993). But clustering bias is no longer an arbitrary bogeyman. We have many
observational probes that should test a bias model: two-point and higher-point clustering over
all scales, redshift distortion patterns, cross-correlations between types of galaxies, galaxy-galaxy
weak lensing maps, and various measures of halo masses. While the simplest formulation of HOD
is surely not the whole story of clustering bias (e.g., Gao et al., 2005; Gao and White, 2007), the
model has passed significant tests. An alternative mechanism that couples to large-scale densities
in a very different manner so as to alter the BAO scale will almost certainly produce far more
detectable effects on smaller scales.



A possible complication to galaxy biasing at the BAO scale was pointed out by Tseliakhovich and
Hirata (2010) and Yoo et al. (2011). At the time of recombination, the pressure of the photons
causes the baryonic matter to have a relative velocity compared to the dark matter, with a typical
value $v_{bc} \sim 30$ km s$^{-1}$. This relative velocity is largely due to the same standing
acoustic waves that produce the BAO feature; it is coherent on scales of a few Mpc and has a feature
in its correlation function at 150 Mpc. After recombination, the sound speed in the baryons drops
to 6 km s$^{-1}$, so the relative velocity is supersonic. Tseliakhovich and Hirata (2010) argue that
this boosts the effective Jeans mass as small dark matter structures fail to retain baryons, thereby
suppressing the formation of the earliest galaxies ($M_{\rm halo} \sim$ 10^ $M_\odot$ at $z > 10$).
This level of suppression depends on the local $v_{bc}$, which varies on large scales. It is unclear
whether this varying suppression causes a detectable imprint on the properties of much more massive
galaxies at low redshift; it may be that it is completely erased as galaxy-mass ($>$ 10^ $M_\odot$)
halos form and wipe out the small-scale initial conditions, or it may be that feedback mechanisms
such as early metal pollution allow some trace of $v_{bc}$ modulation to survive in structures at
$z \ll 10$. In the latter case, it represents a potentially serious concern because (unlike other
systematic errors) the modulation contains the BAO scale (Yoo et al., 2011). However, the form of
the modulation is predictable, and Yoo et al. (2011) find that the measurements of the galaxy
bispectrum would enable the detection and removal of this effect.

Besides gravity, the only physical effect that we reasonably suspect can modulate galaxy
properties on large scales is radiation transport. For example, it is predicted that reionization
proceeds with bubbles of scales of 10 Mpc for hydrogen and 100 Mpc for He II. This may affect the
late-time galaxy density field in non-gravitational ways (McQuinn et al., 2007; Iliev et al., 2008;
Mesinger and Furlanetto, 2008; Zheng et al., 2011; Wyithe and Dijkstra, 2011). However, the scale
of the reionization bubbles is not sharp enough to mimic the acoustic peak, e.g., any reasonable
variation in the luminosity of the ionizing sources will produce a wide spread of bubble sizes.
Reionization effects could be a larger issue for the Lyα forest than for galaxy surveys because one
is mapping the IGM directly. Simulations of Gpc$^3$ volumes that incorporate models of these
reionization effects will be needed to see whether they can detectably influence BAO measurements,
but the absence of a sharply preferred scale in reionization should again provide protection if one
marginalizes over broad band tilts.

4.5.3. Cosmological Systematics

Cosmological effects that alter the sound horizon or the detailed prediction of the linear power
spectrum must contend with the fact that the $z = 1000$ universe and the acoustic oscillations
themselves are exquisitely well observed in the CMB anisotropies. For example, a change in the
recombination history could alter the sound horizon, but this produces correspondingly larger
changes in the damping tail of the primary anisotropies (Eisenstein and White, 2004; de Bernardis
et al., 2009). Effects such as particle decays that change the expansion history so as to alter the
sound horizon affect the gravitational potential of the fluctuations and have large impact on the
CMB anisotropies.

Models that combine adiabatic perturbations with smaller isocurvature ones offer additional
degrees of freedom to constrain in the CMB. Most such combinations yield differences that can be
detected in the acoustic peak structure of the CMB before they affect late-time BAO inferences.
However, Mangilli et al. (2010) show that a particular combination of isocurvature modes may
exist that can change the sound horizon by a moderate amount before the CMB anisotropies are
observably altered. This possibility bears more investigation, e.g., of other late-time observable
consequences of such a model (Mangilli et al., 2010; Carbone et al., 2011; Zunckel et al., 2011).



Finally, we note that if the sound horizon or power spectrum template predicted from $z = 1000$
is wrong, then the effect on the BAO distance scale will typically be a multiplicative error across
all redshifts. This would alter the inference of $w(z)$, but with a particular redshift dependence
that one might choose to be suspicious of if one found it.

In short, while not all cosmological possibilities have been cataloged for their effect on the BAO 
method, one should always judge such possibilities in light of the CMB as well. The combination 
of CMB and BAO is likely to be self-diagnosing of new cosmological physics at high redshift. 
There may be exotica that can slip through this net, but we don't view this potential confusion 
with dark energy dynamics as a demerit of the method. Large cosmological surveys offer a rich 
spectrum of possible analyses with which to corroborate our model of structure formation, and the 
discovery of any discrepancy from vanilla ΛCDM will surely inspire a vigorous search for alternative
explanations. 

4.6. Space vs. Ground

The principal challenge of the BAO method is obtaining the redshifts of millions of faint galaxies. 
Certainly we can obtain redshifts from the ground for tracers at any redshift; the difficulty is in 
doing this quickly and cheaply enough. 

Most BAO work to date has used multi-fiber spectroscopic surveys at optical wavelengths. This
is practical for surveys of order 10^ galaxies. At $z > 1$, one relies on finding very luminous line
emitters, and the desired number of galaxies to reach the cosmic variance limit is of order $10^8$.
Routing optical fiber to $10^8$ objects is technically very demanding. We expect that fiber-fed
optical galaxy redshift surveys will do an excellent job out to $z = 1$ and will make a start at
$1 < z < 1.5$, but will not approach the cosmic variance limit at $z > 1$.

Photometric redshifts of either galaxies or clusters are an option to sample a large volume at
$z \sim 1$ with upcoming surveys and probably at higher redshifts with deeper surveys like LSST.
Redshifts $z < 0.7$ will be done better with funded spectroscopic programs. One is free to pick a
subset of galaxies on which one has better photometric redshift performance. However, photometric
redshifts are best when relying on strong breaks, notably the 4000 Å and Lyman breaks. The former
requires near-IR data at $z > 1$; the latter requires space UV data at $z < 3$. As mentioned above,
photometric redshifts are not precise enough to capture the BAO $H(z)$ information, which is a
large loss at higher redshifts. We expect that the upcoming generation of imaging surveys will be
the first to map the BAO at $z \sim 1$ over large areas of the sky. This will achieve an important
constraint on $D_A(z)$. Later spectroscopic surveys will improve the $D_A(z)$ measurement and add
$H(z)$.



A space mission offers the opportunity for slitless spectroscopy (Glazebrook et al., 2005). This 
efficiently finds the strongest line emitters over a wide instantaneous field. Slitless spectroscopy of 
faint objects is only practical in space, where the foreground (or "sky") emission is low. This is 
particularly attractive in the near-IR, where the zodiacal background light is low and the Hα line 
from z > 1 galaxies is very bright. The UV with Lyα is another opportunity. 

At z > 2, the Lyα line (whether in emission or in the forest) can penetrate the atmosphere. This 
offers a renewed opportunity for ground-based work, but only the Lyα forest is likely to be able to 
approach the cosmic variance limit in the foreseeable future. As described above, this method still 
has significant uncertainties about its observational and theoretical systematics. Galaxy samples 
would again require > $10^8$ objects to reach the cosmic variance limit, a factor of 100 more than 
planned surveys. The Lyα forest gets undesirably thick at z > 3.5, and BAO surveys above this 
redshift might require a space mission, such as the Cosmic Inflation Probe (Melnick et al., 2009). 



A 21 cm facility such as the SKA capable of detecting individual high-redshift galaxies is a 
multi-billion dollar project and hence well in the future, albeit with a large cosmological payoff. 
We note that not all technical implementations of the SKA permit full-sky mapping, and keeping 
this option does increase the cost of the correlator. The 21 cm intensity mapping technique is 
considerably cheaper, but we do not know whether it can achieve the required control of observational 
systematics. Applying intensity mapping to the reionization epoch could eventually measure the 
distance scale at z > 6 (Mao and Wu, 2008; Rhook et al., 2009). 



A space mission that would target of order $10^8$ galaxies at 1 < z < 2 is the only robust near-term 
path to approaching the cosmic variance limit for BAO over the enormous comoving 
volume available in this redshift range. Intensity mapping is an attractive opportunity, but it 
needs substantially more development before it can be realistically assessed. Ground-based galaxy 
redshift surveys and Lyα forest surveys will explore z > 2, though in the near term approaching 
the cosmic variance limit depends on controlling systematic errors in the Lyα forest method, which 
are not yet understood at the percent or sub-percent level. 

4.7. Prospects 

In contrast to essentially all of the other observational probes that we consider in this review, we 
anticipate that even the most ambitious BAO studies will remain limited by statistical errors rather 
than systematic errors. This assumption could prove incorrect, either because we are overoptimistic 
about BAO systematics or because we are too pessimistic about other methods. But it does imply a 
natural long-term target for BAO investigations of cosmic acceleration: survey a large fraction of the 
entire comoving volume out to z ≈ 3.5, beyond which the sensitivity to dark energy begins to decline 
(Table 2), with high enough sampling density that the BAO measurements are limited by sample 
variance rather than shot noise. No one survey will reach this goal on its own; rather, a variety of 
projects can gradually map out the available volume by using different facilities and techniques to 
target different redshift ranges and areas of sky. Surveys that cover the same redshift range with 
the same technique are not redundant unless they cover the same region of the sky. To zeroth order, 
the primary metric for a BAO survey is the comoving volume that is covered at adequate sampling 
density, and it makes sense to choose redshift ranges according to observational convenience (though 
of course one can further optimize both survey strategy and instrument design). Relative to the 
current state of the art described in §4.2 (roughly speaking, analyses that have probed $f_{\rm sky}$ = 0.4 
to z = 0.15, $f_{\rm sky}$ = 0.25 to z = 0.45, and $f_{\rm sky}$ = 0.02 to z = 0.8), BAO surveys have tremendous 
possibility for growth, with correspondingly great opportunities for improved precision and redshift 
leverage on $D_A(z)$ and $H(z)$ (Fig. 
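To give a feel for the "comoving volume" metric discussed above, the following minimal sketch compares the volumes of the three current analyses quoted in the text with the full-sky volume out to z = 3.5. It is only an illustration: the flat ΛCDM parameters (`omega_m = 0.3`, `h = 0.7`), the function names, and the treatment of each survey as a cone starting at z = 0 are assumptions made here, not values taken from the review.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_gpc(z, omega_m=0.3, h=0.7, n_steps=2000):
    """Comoving distance to redshift z in Gpc for an ASSUMED flat LCDM model."""
    dz = z / n_steps
    integral = 0.0
    for i in range(n_steps):
        zi = (i + 0.5) * dz  # midpoint rule
        integral += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + 1.0 - omega_m)
    hubble_distance_mpc = C_KM_S / (100.0 * h)
    return hubble_distance_mpc * integral / 1000.0


def survey_volume_gpc3(f_sky, z_max):
    """Comoving volume (Gpc^3) out to z_max over a fraction f_sky of the sky."""
    return f_sky * (4.0 * math.pi / 3.0) * comoving_distance_gpc(z_max) ** 3


# (f_sky, z_max) pairs quoted in the text for the current state of the art
current_surveys = [(0.4, 0.15), (0.25, 0.45), (0.02, 0.8)]
current_volumes = [survey_volume_gpc3(f, z) for f, z in current_surveys]

# Upper bound on the long-term target: the full sky out to z = 3.5
target_volume = survey_volume_gpc3(1.0, 3.5)

for (f, z), v in zip(current_surveys, current_volumes):
    print(f"f_sky = {f:4.2f}, z < {z:4.2f}: {v:8.2f} Gpc^3")
print(f"f_sky = 1.00, z < 3.50: {target_volume:8.0f} Gpc^3")
```

Under these assumed parameters the full-sky volume to z = 3.5 exceeds the largest of the three current volumes by more than a factor of a hundred, which is the sense in which the available volume is largely unmapped.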



With the completion of WiggleZ (Blake et al., 2011), the only spectroscopic BAO survey currently 
operating is SDSS-III BOSS (P.I. D. Schlegel), described in §4.2 and further in Eisenstein et al. 
(2011). BOSS has just completed its second of five years of data taking and will conclude observations 
in mid-2014. Forecasts for the galaxy survey predict 1.0% precision on $D_A$ at z = 0.35 and 
z = 0.6 and 1.8% precision on $H(z)$ at these redshifts. The Lyα forest survey is expected to yield 
4.5% precision on $D_A$ and 2.6% on $H(z)$ at z = 2.5. BOSS will provide a solid BAO anchor at low 
redshift, the first BAO measurements at z > 2, and the first practical test of the Lyα forest technique. 

The Hobby-Eberly Telescope Dark Energy Experiment (HETDEX) is largely funded and currently 
under construction. HETDEX plans a survey of 800,000 Lyα emission-line galaxies over 
420 square degrees at redshifts 1.8 < z < 3.7, using a blind-pointing strategy with a large set of 
integral-field spectrographs (Hill et al., 2006). The forecast precision on $D_A$ and $H$ is of order 2% 
from BAO alone, with additional gain possible if one can take advantage of the increased linearity 
of the large-scale density field at high redshift to model the full anisotropic clustering signal of the 
galaxies. 

PanSTARRS and DES are two near-term imaging surveys with the depth and area needed to 
probe BAO at z ~ 1. BAO analyses will likely focus on red galaxies as they afford more robust 
photometric redshifts and the two cameras employ red-sensitive detectors that achieve good depth 
in the z and y bands. These projects will likely yield the first strong BAO constraints at z = 1. 

A multitude of more ambitious projects are being planned. On the imaging front, LSST should 
eventually yield an enormous sample of galaxies with good photometric redshifts, enabling photo-z 
BAO studies to reach to z = 2 and beyond. A near-term Spanish project called Physics of 
the Accelerating Universe (PAU) aims to do shallower imaging with many medium-band filters, 
designed to achieve high enough redshift precision to recover $H(z)$ information out to z ~ 1 
(Benítez et al., 2009; Gaztañaga et al., 2011). This medium-band strategy is intermediate between 
photometric and spectroscopic approaches. 

Returning to spectroscopy, eBOSS, part of a proposed (but not yet funded) program of post-2014 
surveys on the Sloan 2.5-meter telescope, would extend the BOSS survey in several directions, 
using higher redshift LRGs (to z = 0.8), emission line galaxies, and quasars, including a denser set of 
z > 2 quasars to improve measurements from the Lyα forest. eBOSS would cover 1500-3000 deg$^2$, 
depending on strategy details that are still to be decided. The BigBOSS experiment (Schlegel et al., 
2011) would use spectrographs fed by 5000 optical fibers over a 3-degree field on the Mayall 4-meter 
telescope at Kitt Peak to survey 14,000 deg$^2$. For its five-year primary survey, BigBOSS would 
target luminous red galaxies to z = 1 and emission line galaxies to z = 1.7, more than 10 million 
galaxies in total, with sampling density $nP > 1$ out to z ≈ 1-1.2. BigBOSS would target high 
redshift quasars with a high enough density to approach the sample variance limit for the Lyα 
forest method at 2 < z < 3. The BigBOSS instrument could in principle be moved to the Blanco 
telescope at CTIO to conduct a similar survey of the southern hemisphere. Alternatively, the 
DES collaboration has considered a 4000-fiber instrument (DESpec) that would use the DECam 
optical corrector on the Blanco; this instrument could pursue a similar galaxy redshift survey 
but would not (in its current design) have the blue wavelength coverage needed to map the Lyα 
forest. The SuMIRe project proposed for the Subaru 8-meter telescope would use optical/IR prime 
focus spectrographs fed by 2400 fibers to carry out a large galaxy redshift survey, mapping BAO 
in the redshift range 0.7 < z < 2.4. The current baseline program would survey 4 million 
[OII]-emitting galaxies over 1420 deg$^2$. Collectively, these ground-based optical/IR projects could cover a 
substantial fraction of the sky with fully sampled galaxy surveys to z ~ 1.2, provide interesting BAO 
measurements with lower sampling densities to z ≈ 1.7, and possibly measure BAO to something 
approaching the cosmic variance limit at z = 2-3 using the Lyα forest. 

Both Euclid and WFIRST plan large BAO surveys as major components of their dark energy 
science programs, using slitless near-IR spectroscopy to measure redshifts of strong Hα emitters 
in the redshift range 0.7 < z < 2.0. Current incarnations of these plans are described in the 
Euclid Red Book (Laureijs et al., 2011) and the WFIRST Science Definition Team's interim report 
(Green et al., 2011), though technical specifications and survey strategies are likely to evolve to 
some degree prior to launch. The present baseline strategies involve survey areas of approximately 
15,000 deg$^2$, reaching the sample variance limit at the low end of the redshift range but not at 
the high end. Green et al. (2011; see their Figure 15) attempt a side-by-side forecast of BigBOSS, 
Euclid, and WFIRST BAO performance, using common modeling assumptions and each survey's 
stated estimates of flux limits or galaxy yields. For Euclid, they predict a fractional error on $H(z)$ 
of approximately 1.5% per Δz = 0.1 bin from z = 0.7 to z = 1.5, rising to $\sigma_H/H$ ≈ 3% by z = 1.8. 
For WFIRST, they predict $\sigma_H/H$ = 1.2-1.8% per Δz = 0.1 bin over the full range 0.7 < z < 2.0, 
with error below 1.4% over z = 0.9-1.8. The predicted WFIRST performance begins to exceed 
BigBOSS performance at z > 1.2, where the low near-IR sky background from space becomes a 
major advantage. These numbers should be taken with a grain of salt, as they depend on uncertain 
hardware and software performance and on details of survey strategy. By the time these missions 
are launched, results from earlier dark energy experiments or developments in modeling techniques 
could well favor alternative strategies, e.g., with deeper sampling but smaller sky area. Furthermore, 
the Euclid and WFIRST dark energy programs are both limited by observing time, and either could 
be more powerful with a longer mission. It is clear, however, that these missions can dramatically 
improve our knowledge of dark energy evolution at z = 1-2. 

Shifting wavelengths, several 21 cm intensity mapping experiments for the range 0.8 < z < 3 
are being planned. Two different techniques are cylindrical telescope arrays (Peterson et al., 2006) 
and the FFT-based Omniscope (Tegmark and Zaldarriaga, 2010). An example of the former is the 
CHIME project, which aims to build a 100 meter square filled interferometer and conduct a lengthy 
survey at 0.8 < z < 2.5. Baselines of 100 meters do not yield sufficient angular resolution to extract 
all of the BAO information, but if the foregrounds can be adequately controlled, CHIME would be 
a powerful demonstrator of the 21 cm method and would yield excellent cosmological information. 
Moving beyond intensity mapping, the SKA could enable an HI-redshift survey of a billion galaxies, 
reaching the sample variance limit over half the sky out to z = 3 (Abdalla and Rawlings, 2005), 
which would be a good approximation to the ultimate BAO experiment. 

5. Weak Lensing 



The subtle distortion of shapes of distant galaxies by gravitational lensing is a powerful probe 
of both the mass distribution and the global geometry of the universe. It has, however, turned 
out to be one of the most technically difficult of the cosmological probes. This section will cover 
the range of applications of weak lensing (which we will sometimes abbreviate to WL), the recent 
and planned weak lensing surveys, and the technical aspects of weak lensing image processing and 
control of systematics. By covering the latter subjects in some detail (including some methods that 
we think have been under-appreciated or under-utilized), we hope to stimulate further progress and 
be helpful to readers who are already experts in weak lensing. 

This section is organized as follows: we begin with a qualitative overview of weak lensing and 
its uses (§5.1). We then go into a mathematical treatment of the various statistics that can be used 
and their dependences on the background cosmology and matter power spectrum (§5.2). We then 
review the observational results from recent weak lensing surveys (§5.3). Next we move into more 
technical descriptions of survey design including source redshift estimation and weak lensing outside 
the optical/near-IR bandpass (§5.4), the measurement of galaxy shapes (§5.5), and astrophysical 
uncertainties (§5.6). We summarize the major systematic errors and mitigation strategies (§5.7). 
We finally consider the advantages of a space mission for weak lensing (§5.8) and prospects for the 
future (§5.9). 

Some of the material in this section is technical and in a first reading may be either skipped or 
skimmed; but given that so much of the promise of weak lensing depends on these issues, we felt 
compelled to include them. The more technical sections have been denoted with an asterisk (*). 
They may be thought of as analogous to, e.g., the "Track 2" material in Misner et al. (1973). 

5.1. General principles: Overview 

The images of distant galaxies that we see are distorted by gravitational lensing by foreground 
structures. In rare cases, such as behind clusters, one observes strong lensing: the deflection of light 
by massive structures can result in multiple images of the same background galaxy. More often, 
however, images of galaxies are subjected only to weak lensing: a small distortion of their size and 
shape, typically of the order of 1%. Since one does not know the intrinsic size or shape of a given 
galaxy, weak lensing can only be measured statistically by examining the correlations of shapes in 
deep and wide sky surveys. However, the payoff if these statistical correlations can be measured is 
enormous: weak lensing provides a direct measure of the distribution of matter, independent of any 
assumptions about galaxy biasing. Since this distribution can be predicted theoretically, even in 
the quasilinear regime, and since its amplitude can be directly used to constrain cosmology (unlike 
for galaxy surveys where one must marginalize over the bias), weak lensing has great potential as 
a cosmological probe. 

In principle, one may attempt to observe either the shearing of galaxies (shape distortion) or 
their magnification (size distortion). In practice, the shape distortions have been used much more 
widely, since the mean shape of galaxies is known (they are statistically round: as many galaxies 
are elongated on the x-axis as on the y-axis) and the scatter in their shapes is less than the scatter 
in their sizes. 

A variety of statistical approaches have been used to extract information from weak lensing 
shear. The simplest is the angular shear correlation function, or its Fourier transform, the shear 
power spectrum. These are related to integrals over the matter power spectrum along the line of 
sight, and as such in the linear regime at low redshift they scale as $\propto \Omega_m^2 \sigma_8^2$. (Warning: 
these scalings are altered even at modest redshift, or in the nonlinear regime, where the exponent of 
$\sigma_8$ becomes closer to 3.) Since the angular power spectrum is rather featureless, more information 
can be extracted via tomography, i.e., the measurement of the shear correlation function as a function 
of the redshifts of the galaxies observed, including the use of cross-correlations between redshift 
slices. Information on the relation between galaxies and matter can be obtained via galaxy-galaxy 
lensing, i.e., the correlation of the density field of nearby galaxies with the lensing shear measured 
on more distant galaxies. In the linear regime, the galaxy-galaxy lensing signal scales as 
$\propto b\,\Omega_m \sigma_8^2$ and thus provides information on the bias, while in the nonlinear regime it 
probes individual galaxy haloes and hence places constraints on the halo occupation distribution 
(§2.3). Combination of this with the galaxy clustering signal (which scales as $\propto b^2 \sigma_8^2$) 
enables one to eliminate the bias and measure $\Omega_m \sigma_8$. The scaling of the galaxy-galaxy lensing 
signal as a function of the source redshift, known as cosmography, depends purely on geometric 
factors and hence can be used to partially construct a distance-redshift relation. (The cosmography 
distance scale suffers from three degeneracies, including the absolute-scale degeneracy that affects 
supernova measurements; see §5.2.7.) Finally, the low-redshift matter distribution is non-Gaussian, 
so higher-order statistics such as the bispectrum or 3-point shear correlation function carry 
additional information. 

For all of the applications of weak lensing to cosmology, deep wide-field imaging is essential. One 
can see this from a simple order-of-magnitude estimate. For a scatter in galaxy shapes of $\sigma_\gamma \sim 0.2$, 
measuring a 1% shear with unit signal-to-noise ratio requires ~ 500 galaxies ($0.2/\sqrt{500} \sim 0.01$). 
Measuring the amplitude of density perturbations to 1% accuracy requires that this be done over 
~ $10^4$ patches of sky, giving a requirement of order $10^7$ galaxies, which for a density of 15 resolved 
galaxies per arcmin$^2$ amounts to surveying 200 deg$^2$ of sky. This is the scale of the largest current 
surveys such as CFHTLS; in practice the errors from these surveys are likely to be closer to several 
percent due to "factors of a few" that we have dropped here, and due to the inclusion of systematic 
errors. The eventual goal of the weak lensing community is one or more "Stage IV" surveys (such 
as LSST on the ground and Euclid and WFIRST in space) that would measure shapes of ~ $10^9$ 
galaxies and achieve an additional order of magnitude in precision. Such surveys will have to face 
the daunting task of reducing systematic errors by another order of magnitude. 

There are unfortunately many sources of these systematic errors, and most of the effort of the 
weak lensing community has been devoted to defeating them. One is the measurement of galaxy 
shapes: while gravitational lensing by a large-scale density perturbation can coherently align the 
images of many galaxies, such alignment can also arise from shaking of the telescope or optical 
aberrations. The accurate determination of the point-spread function (PSF) of the telescope (usually 
based on observations of stars) and removal of its effects is thus critical. This problem gets much 
worse if one tries to model galaxies with sizes similar to or smaller than the PSF. High-resolution, 
stable imaging can help with this problem, motivating placement of future instruments at the best 
ground-based sites or in space. The determination of redshifts for the large number of source galaxies 
is also a concern. It is not practical to obtain a robust spectroscopic redshift of every galaxy, and 
hence "photometric redshifts" (estimates of galaxies' redshifts based on their broadband colors) are 
used. These must be calibrated so that their biases, scatters, and outlier distributions are well 
known. Finally, there are astrophysical uncertainties: galaxies can suffer "intrinsic alignments" 
(non-random orientations), and the matter power spectrum may deviate from pure CDM simulations 
at small scales. Much of our discussion here will be focused on the methodologies that have been 
developed to suppress systematics at each stage of the observations and analysis. 

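The order-of-magnitude estimate in the overview of §5.1 can be sketched in a few lines. This is a shape-noise-only toy calculation, not a forecast: it treats every galaxy as an independent shear estimate, ignores the "factors of a few" and all systematic errors, and the function names and the Stage IV sample size of 1e9 galaxies are assumptions introduced here for illustration.

```python
import math


def n_galaxies(area_deg2, density_per_arcmin2):
    """Number of resolved source galaxies in a survey of the given area."""
    return area_deg2 * 3600.0 * density_per_arcmin2


def amplitude_error(n_gal, sigma_gamma=0.2, shear=0.01):
    """Fractional error on a shear of the given size from shape noise alone."""
    return sigma_gamma / (shear * math.sqrt(n_gal))


# One patch of ~500 galaxies measures a 1% shear at roughly unit signal-to-noise
snr_patch = 1.0 / amplitude_error(500)

# A 200 deg^2 survey with 15 resolved galaxies per arcmin^2
n_current = n_galaxies(200.0, 15.0)
err_current = amplitude_error(n_current)

# ASSUMED Stage IV sample of order 1e9 galaxies
err_stage4 = amplitude_error(1e9)

print(f"S/N per 500-galaxy patch : {snr_patch:.2f}")
print(f"galaxies in 200 deg^2    : {n_current:.2e}")
print(f"amplitude error, 200 deg2: {err_current:.4f}")
print(f"amplitude error, 1e9 gal : {err_stage4:.5f}")
```

With these inputs the 200 deg$^2$ survey lands at the percent level and the larger sample gains roughly another order of magnitude, in line with the scaling described in the text.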
5.2. Weak lensing principles: Mathematical discussion 

We will now go into greater detail on the mathematical aspects of weak lensing, both the 
construction of the weak lensing field and the various statistics that one can extract from it. The 
modern theoretical formalism of weak lensing traces back largely to the papers of Blandford et al. 
(1991), Miralda-Escudé (1991), and Kaiser (1992), though one can find roots in the much earlier 
papers of Kristian and Sachs (1966) and Gunn (1967). 



5.2.1. Deflection of light in cosmology 

Gravitational lensing gives a mapping from the intrinsic, unlensed image of the sources of light 
on the sky — the source plane — to the actual observable sky — the image plane. Our ultimate 
goal is to extract information about the statistics and redshift dependence of this mapping and use 
it to constrain cosmological parameters. Our task here is thus twofold. First, we must derive the 
mapping function that relates the source to the image plane. However, since we do not know the 
intrinsic appearance of the sources, we cannot directly infer the lens mapping from observations. 
Therefore, our second task will be to determine what properties of the lens map can be measured, 
and with what accuracy. 

In a fully general context, the lens mapping can be obtained by taking an observer and following 
the geodesics corresponding to that observer's past light cone. We will make some simplifying 
approximations here, namely that: (i) the spacetime is described by a Friedmann-Robertson-Walker 
metric with scalar perturbations and negligible anisotropic stresses (appropriate for nonrelativistic 
matter, scalar fields, and Λ); (ii) deflection angles are sufficiently small that we may use the 
flat-sky approximation; (iii) the evolution of perturbations is slow enough that we may neglect time 
derivatives of the gravitational potential Φ in comparison to spatial derivatives (i.e., nonrelativistic 
motion); and (iv) such perturbations are small enough that we may compute the lens mapping only 
to first order in perturbation theory. (These approximations are sufficient to analyze present power 
spectrum data, but corrections to (iv) will become necessary in the future.) Within these 
approximations, we may write the angular coordinates $(\theta_1, \theta_2)$ of a light ray projected back to 
comoving distance $D_C$ (see eq. 7) in terms of the position $(\theta_1^I, \theta_2^I)$ in the image plane as 

$$\theta_i(D_C) = \theta_i^I - 2\int_0^{D_C} g(D_{C1}, D_C)\,\frac{\partial\Phi}{\partial\theta_i}\left[D_{C1}, \theta_j(D_{C1})\right]\,dD_{C1}, \qquad (56)$$

where $g$ is the Green's function, 

$$g(D_{C1}, D_C) = \int_{D_{C1}}^{D_C} \left[D_A(D_{C2})\right]^{-2}\,dD_{C2} = \cot_K(D_{C1}) - \cot_K(D_C). \qquad (57)$$

Here $\cot_K(D_C)$ is the cotangentlike function, 

$$\cot_K(D_C) = \begin{cases} D_C^{-1} & \text{flat} \\ K^{1/2}\cot(K^{1/2}D_C) & \text{closed} \\ |K|^{1/2}\coth(|K|^{1/2}D_C) & \text{open,} \end{cases} \qquad (58)$$

with the dimensional curvature $K$ as defined earlier in this review, and Φ is the Newtonian 
gravitational potential. (The derivation of equation (56) can be found in many works, though not 
always in the same notation. See, e.g., eq. (6.9) in the classic review by Bartelmann and Schneider 
(2001). The appendix of Hirata and Seljak (2003a) gives a shorter derivation in more similar 
notation.) 

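As a numerical sanity check on the Green's function of equation (57), the sketch below compares the closed form, a difference of cotangentlike functions, with a direct midpoint-rule integration of $[D_A]^{-2}$ for flat, closed, and open geometries. It assumes that $D_A$ in equation (57) is the comoving angular diameter distance (for example $K^{-1/2}\sin(K^{1/2}D_C)$ in the closed case); the function names, the arbitrary length units, and the sample curvature values are illustrative choices, not taken from the review.

```python
import math


def comoving_angular_distance(d_c, k):
    """ASSUMED comoving angular diameter distance for curvature k (1/length^2)."""
    if k == 0.0:
        return d_c
    if k > 0.0:
        return math.sin(math.sqrt(k) * d_c) / math.sqrt(k)
    return math.sinh(math.sqrt(-k) * d_c) / math.sqrt(-k)


def cot_k(d_c, k):
    """Cotangentlike function: flat, closed (k > 0), and open (k < 0) cases."""
    if k == 0.0:
        return 1.0 / d_c
    if k > 0.0:
        root = math.sqrt(k)
        return root / math.tan(root * d_c)
    root = math.sqrt(-k)
    return root / math.tanh(root * d_c)


def green_closed_form(d_c1, d_c, k):
    """Green's function as a difference of cotangentlike functions."""
    return cot_k(d_c1, k) - cot_k(d_c, k)


def green_numeric(d_c1, d_c, k, n_steps=20000):
    """Same quantity from a midpoint-rule integral of D_A**-2 between d_c1 and d_c."""
    h = (d_c - d_c1) / n_steps
    return sum(h / comoving_angular_distance(d_c1 + (i + 0.5) * h, k) ** 2
               for i in range(n_steps))


for k in (0.0, 0.05, -0.05):  # flat, closed, open (arbitrary units)
    print(k, green_closed_form(1.0, 3.0, k), green_numeric(1.0, 3.0, k))
```

The two columns agree to within the integration error in all three geometries, which is a useful check that the curvature factors in the cotangentlike function carry the right power of $K$.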
The potential derivative in equation (56) is evaluated at the position of the deflected ray 
$\theta_j(D_{C1})$, so it represents an implicit solution to the light deflection problem. However, in 
linear perturbation theory (see our assumption iv above), we may evaluate it at the position of the 
undeflected ray. This is known as the Born approximation. When we do this, it is permissible to 
pull the angular derivative out of the integral and write 

$$\theta_i^S \equiv \theta_i(D_C) = \theta_i^I + \frac{\partial\psi(D_C, \boldsymbol{\theta}^I)}{\partial\theta_i^I}, \qquad (59)$$

where ψ is the lensing potential: 

$$\psi(D_C, \boldsymbol{\theta}^I) = -2\int_0^{D_C} \left[\cot_K(D_{C1}) - \cot_K(D_C)\right]\Phi(D_{C1}, \boldsymbol{\theta}^I)\,dD_{C1}. \qquad (60)$$

Here it is important to remember that $D_C$ represents the distance to the sources; one integrates 
over lens distances $D_{C1}$. 

Equation (59) provides the mapping from the observed image plane to the source plane, $\boldsymbol{\theta}^S(\boldsymbol{\theta}^I)$. 
In what follows, we will assume that this mapping is one-to-one: this is known as the regime of 
weak lensing. In the small portion of sky covered by very massive objects, the alternate regime 
of strong lensing occurs, in which several points in the image plane map to the same point in the 
source plane. Strong lensing is an important probe of the matter distribution in clusters, but we 
will not pursue it in this article. 

5.2.2. Cosmic shear, magnification, and flexion 

We have now accomplished our first task: deriving the lens mapping from the matter distribution. 
However, we now need a way to classify the observables in the lens mapping. The potential 
ψ is of course not observable itself: like the Newtonian gravitational potential, its zero-level is 
arbitrary. Its angular derivative $\partial\psi/\partial\theta_i$ is the deflection angle: the difference between the true 
position of a source $\theta_i^S$ and its apparent position $\theta_i^I$. However, since sources (in practice, galaxies) 
can be at any position, we cannot measure the deflection angle either. 

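Let us now consider the second derivative of the lensing potential. Added to the identity, it is 
simply the Jacobian of the mapping from image to source plane: 

$$\frac{\partial\theta_i^S}{\partial\theta_j^I} = \delta_{ij} + \frac{\partial^2\psi}{\partial\theta_i^I\,\partial\theta_j^I} = \begin{pmatrix} 1 - \kappa - \gamma_+ & -\gamma_\times \\ -\gamma_\times & 1 - \kappa + \gamma_+ \end{pmatrix}. \qquad (61)$$
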

We have separated the 3 independent entries in the symmetric 2×2 matrix of partial derivatives into 
3 components: the magnification κ and the 2 components of shear, $\gamma_+$ and $\gamma_\times$. The magnification 
has three effects: 

• It makes the angular size of a galaxy look larger by a factor of 1 + κ. 

• It makes the galaxy appear brighter by a factor of the inverse-determinant of the Jacobian, 
1 + 2κ, since lensing conserves surface brightness as dictated by Liouville's theorem. 

• It dilutes the number density of galaxies by a factor of 1 − 2κ, since the angular spacing 
between neighboring galaxies is increased by a factor 1 + κ. 

Magnification is a "scalar" in the sense that it is invariant under rotations of the $(\theta_1, \theta_2)$ coordinate 
axes. 

The shear stretches the galaxy along one axis and squeezes on the other: the image of an 
intrinsically round galaxy appears elongated along the $\theta_1$ axis if $\gamma_+ > 0$ and along the $\theta_2$ axis if 
$\gamma_+ < 0$. The $\gamma_\times$ component stretches and squeezes along the diagonal (45°) axes. The shear is a 
"spin-2 tensor" in the sense that under a counterclockwise rotation of the coordinate axes by angle 
δ, it transforms as 

$$\begin{pmatrix} \gamma_+ \\ \gamma_\times \end{pmatrix}_{\rm new} = \begin{pmatrix} \cos 2\delta & \sin 2\delta \\ -\sin 2\delta & \cos 2\delta \end{pmatrix} \begin{pmatrix} \gamma_+ \\ \gamma_\times \end{pmatrix}_{\rm old}. \qquad (62)$$

If all galaxies were round, then each galaxy would provide a direct estimate of the shear, since 
we could find the values of $(\gamma_+, \gamma_\times)$ that transformed an initially circular galaxy into the observed 
image. In reality, galaxies come in many shapes, and any such estimate of the shear components 
will have some standard deviation $\sigma_\gamma$ known as the shape noise. But in an ensemble average sense 
galaxies are round: there are as many galaxies in the universe elongated along the $\theta_1$ axis as 
the $\theta_2$ axis. Thus, if we take $N$ galaxies in the same region of sky, we may expect that the shear 
components in that region can be measured with a standard deviation of ~ $\sigma_\gamma/\sqrt{N}$. 
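The three statements above (brightening by the inverse determinant of the Jacobian, the spin-2 rotation law, and the $\sigma_\gamma/\sqrt{N}$ averaging of shape noise) can be checked with a short sketch. It is a toy model under stated assumptions: each galaxy's measured ellipticity is taken to be the true shear plus independent Gaussian noise of width `sigma_gamma` per component, which ignores PSF effects and intrinsic alignments, and all function names are hypothetical.

```python
import math
import random


def jacobian(kappa, gamma_plus, gamma_cross):
    """Image-to-source Jacobian for convergence kappa and shear components."""
    return [[1.0 - kappa - gamma_plus, -gamma_cross],
            [-gamma_cross, 1.0 - kappa + gamma_plus]]


def brightening(kappa, gamma_plus=0.0, gamma_cross=0.0):
    """Flux magnification: inverse determinant of the Jacobian."""
    a = jacobian(kappa, gamma_plus, gamma_cross)
    return 1.0 / (a[0][0] * a[1][1] - a[0][1] * a[1][0])


def rotate_shear(gamma_plus, gamma_cross, delta):
    """Shear components after rotating the coordinate axes by angle delta."""
    c, s = math.cos(2.0 * delta), math.sin(2.0 * delta)
    return c * gamma_plus + s * gamma_cross, -s * gamma_plus + c * gamma_cross


def mean_shear_estimate(gamma_plus, gamma_cross, n_gal, sigma_gamma=0.2, seed=42):
    """Average of n_gal noisy per-galaxy shear estimates (toy Gaussian shape noise)."""
    rng = random.Random(seed)
    sum_plus = sum_cross = 0.0
    for _ in range(n_gal):
        sum_plus += gamma_plus + rng.gauss(0.0, sigma_gamma)
        sum_cross += gamma_cross + rng.gauss(0.0, sigma_gamma)
    return sum_plus / n_gal, sum_cross / n_gal


print(brightening(0.01))                      # close to 1 + 2*kappa
print(rotate_shear(0.02, 0.0, math.pi / 4))   # plus shear becomes cross shear
print(rotate_shear(0.03, -0.01, math.pi))     # unchanged: shear is spin-2
print(mean_shear_estimate(0.01, 0.0, 10000))  # scatter of about 0.2 / sqrt(10000)
```

A rotation by 180° leaves the shear unchanged while a rotation by 45° exchanges the two components, which is what distinguishes a spin-2 quantity from an ordinary vector.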

Several caveats are in order at this point, and they form the basis for most of the technical 
problems in weak lensing. One is that a circular galaxy re-mapped by the Jacobian (eq. 61) 
becomes an ellipse, but since in the real sky one does not observe a population of galaxies with 
homologous elliptical isophotes, there is no unique procedure to estimate the shear. Moreover, real 
telescopes, even in space, have finite resolution, and the observed image is convolved with a PSF 
that smears the galaxy and may introduce spurious elongation on some axis. These two problems 
together are referred to as the shape measurement problem. A more fundamental issue is that real 
galaxies are not randomly oriented: they have preferred directions of orientation that are correlated 
with each other and with large-scale structure, and thus contaminate statistical measures of the 
cosmic shear field. This is known as the intrinsic alignment problem. We will discuss all of these 
problems in §§5.4-5.7. 

Measuring magnification κ has proven more difficult than measuring shear. One might imagine 
comparing the size, magnitude, or abundance of galaxies in some region of sky to a typical or 
"reference" value, but there is a very wide dispersion in galaxy sizes and magnitudes, and since 
some galaxies are too faint to observe even in deep surveys one cannot measure such a thing as the 
total number of galaxies. Rather, one can measure the cumulative number of galaxies brighter than 
some flux threshold, $N(>F)$. If the number counts have a power-law slope α, i.e. $N(>F) \propto F^{-\alpha}$, 
then magnification will perturb this distribution by a factor 

$$N(>F, {\rm observed}) \propto \left[1 + 2(\alpha - 1)\kappa\right]F^{-\alpha}. \qquad (63)$$

There are two competing effects here: in regions of higher magnification the galaxies appear 
brighter, which gives the 2ακ term in equation (63), but there is also the dilution of galaxy 
number, which is responsible for the "−1" term. Unfortunately, for optical galaxies the observed 
number count slope is close to the critical value α ≈ 1 for which magnification is not measurable. 
Moreover, the intrinsic clustering of galaxies gives large fluctuations in the number density 
that greatly exceed those due to lensing effects. For these reasons, magnification has lagged behind 
shear as a cosmological probe, and the cosmic magnification signal was not seen until Scranton et al. 
(2005) measured it using cross-correlation of foreground galaxies and background quasars. 
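The competition between brightening and dilution in equation (63) can be illustrated with a small Monte Carlo sketch. It draws fluxes from a pure power law, brightens each by 1 + 2κ, thins the sample by 1 − 2κ, and compares the change in counts above a flux cut with the first-order prediction. The function names, the flux cut, the sample size, and the value κ = 0.05 (chosen large so that the effect is visible above sampling noise) are assumptions made for the example.

```python
import random


def linear_count_ratio(alpha, kappa):
    """First-order prediction for N_observed(>F) / N_unlensed(>F)."""
    return 1.0 + 2.0 * (alpha - 1.0) * kappa


def simulated_count_ratio(alpha, kappa, flux_cut=2.0, n_gal=200_000, seed=3):
    """Toy Monte Carlo of magnification bias for power-law number counts."""
    rng = random.Random(seed)
    unlensed = lensed = 0
    for _ in range(n_gal):
        flux = rng.paretovariate(alpha)  # N(>F) proportional to F**(-alpha), F >= 1
        if flux > flux_cut:
            unlensed += 1
        # Dilution: each galaxy stays in the patch with probability 1 - 2*kappa.
        kept = rng.random() < 1.0 - 2.0 * kappa
        # Brightening: observed flux is larger by a factor 1 + 2*kappa.
        if kept and flux * (1.0 + 2.0 * kappa) > flux_cut:
            lensed += 1
    return lensed / unlensed


for alpha in (1.0, 2.0):
    print(alpha, linear_count_ratio(alpha, 0.05), simulated_count_ratio(alpha, 0.05))
```

For a slope of 1 the two effects cancel to first order and the counts barely change, while for a steeper slope the brightening wins. The simulated ratios differ from the linear formula at second order in κ, as expected for a first-order expression.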

The most promising route to utilizing the cosmic magnification signal is to use scaling relations 
that relate the size of a galaxy (as quantified by, e.g., the half-light radius) to parameters that 
are magnification-independent and can be measured in photometric surveys (Bertin and Lombardi, 
2006), such as the surface brightness, the Sérsic index, or (for AGN) variability amplitude. 
Huff and Graves (2011) present a first application of this "photometric magnification" method to 
galaxies, and Bauer et al. (2011) an application to quasars. 

After shear and magnification comes the third derivative of the potential, i.e. the variation of 
shear and convergence across a galaxy. This effect is called the flexion, and it manifests itself via 
asymmetric banana and triangle-like distortions of an initially circular galaxy (Goldberg and Bacon). 
Flexion has been measured by several groups. However, because of the extra derivative it is 
sensitive mainly to structure at the very smallest scales, so it is primarily a tool for cluster lensing 
rather than cosmological applications on larger scales. 

5.2.3. Power spectra and correlation functions* 

Just as for any other random field in cosmology, one may construct statistics for the cosmic 
shear field. The most popular are the power spectrum and its real-space equivalent, the correlation 
function. 

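To construct the power spectrum, we take the Fourier transform of the shear field, 

$$\gamma_{+,\times}(\mathbf{l}) = \int \gamma_{+,\times}(\boldsymbol{\theta})\,e^{-i\mathbf{l}\cdot\boldsymbol{\theta}}\,d^2\boldsymbol{\theta} \;\leftrightarrow\; \gamma_{+,\times}(\boldsymbol{\theta}) = \int \gamma_{+,\times}(\mathbf{l})\,e^{i\mathbf{l}\cdot\boldsymbol{\theta}}\,\frac{d^2\mathbf{l}}{(2\pi)^2}. \qquad (64)$$

It is convenient to rotate the Fourier-space components from the coordinate axis basis to a basis 
aligned with the direction of the wavevector, which is actually a preferred direction in the problem. 
The rotated components are called the E-mode and B-mode: 

$$\gamma_E(\mathbf{l}) = \cos(2\phi_{\mathbf{l}})\gamma_+(\mathbf{l}) + \sin(2\phi_{\mathbf{l}})\gamma_\times(\mathbf{l}) \quad {\rm and} \quad \gamma_B(\mathbf{l}) = \cos(2\phi_{\mathbf{l}})\gamma_\times(\mathbf{l}) - \sin(2\phi_{\mathbf{l}})\gamma_+(\mathbf{l}), \qquad (65)$$

where $\tan\phi_{\mathbf{l}} = l_2/l_1$. Thus the E-mode of the shear field corresponds to galaxies that are stretched 
and squashed in the direction of the wave vector and perpendicular to it, whereas the B-mode 
corresponds to stretching and squashing at a 45° angle. One may then define the power spectra: 

$$\langle\gamma_E^*(\mathbf{l})\,\gamma_E(\mathbf{l}')\rangle = (2\pi)^2\,C_{EE}(l)\,\delta^{(2)}(\mathbf{l} - \mathbf{l}'), \qquad (66)$$

and similarly for $C_{EB}(l)$ and $C_{BB}(l)$. Rotational symmetry guarantees that these depend only on 
the magnitude of $\mathbf{l}$ and not its direction, and reflection symmetry guarantees that $C_{EB}(l) = 0$. 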

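In order to compute these power spectra, we need to express the Fourier modes in terms of those
of the lensing potential. From the definition, equation (61), the shear is seen to be the derivative
of the deflection angle and hence the second derivative of the lensing potential.
Using the replacement $\partial/\partial\theta_i \rightarrow i l_i$, we find in Fourier space

$$\tilde\gamma_+(\mathbf{l}) = \tfrac{1}{2}(l_1^2 - l_2^2)\,\tilde\psi(\mathbf{l}) = \tfrac{1}{2} l^2 \cos(2\phi_l)\,\tilde\psi(\mathbf{l}) \quad\text{and}\quad \tilde\gamma_\times(\mathbf{l}) = l_1 l_2\,\tilde\psi(\mathbf{l}) = \tfrac{1}{2} l^2 \sin(2\phi_l)\,\tilde\psi(\mathbf{l}).$$

Substitution into equation ([65]) implies that

$$\tilde\gamma_E(\mathbf{l}) = \tfrac{1}{2} l^2\,\tilde\psi(\mathbf{l}) \quad\text{and}\quad \tilde\gamma_B(\mathbf{l}) = 0.$$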

We thus arrive at the remarkable conclusion that cosmic shear possesses only an E-mode; the
B-mode shear must vanish, and we have $C_{BB}(l) = 0$. This is a valuable, though not foolproof, test
for systematics in WL surveys.
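The vanishing of the B-mode can be checked numerically. The following is a minimal sketch, not a survey pipeline: it assumes a flat, periodic pixel grid, builds the shear from an arbitrary random lensing potential in Fourier space, and then applies the rotation of equation (65). The function names and the grid size are illustrative choices, and the sign convention (shear equal to $+\frac{1}{2}l^2$ times the potential) is an assumption; the B-mode result does not depend on that overall sign.

```python
import numpy as np

def wavevector_grid(shape):
    """Return (l1, l2) on the FFT grid, in radians per pixel."""
    l1 = 2.0 * np.pi * np.fft.fftfreq(shape[0])[:, None]
    l2 = 2.0 * np.pi * np.fft.fftfreq(shape[1])[None, :]
    return l1, l2

def shear_from_potential(psi):
    """Shear components generated by a lensing potential (flat sky, periodic grid)."""
    l1, l2 = wavevector_grid(psi.shape)
    psi_l = np.fft.fft2(psi)
    gamma_plus_l = 0.5 * (l1**2 - l2**2) * psi_l
    gamma_cross_l = l1 * l2 * psi_l
    return np.fft.ifft2(gamma_plus_l).real, np.fft.ifft2(gamma_cross_l).real

def eb_decompose(gamma_plus, gamma_cross):
    """Rotate the Fourier-space shear into E and B modes, as in eq. (65)."""
    l1, l2 = wavevector_grid(gamma_plus.shape)
    phi = np.arctan2(l2, l1)  # angle of the wavevector, broadcast over the grid
    gp = np.fft.fft2(gamma_plus)
    gx = np.fft.fft2(gamma_cross)
    gamma_E = np.cos(2 * phi) * gp + np.sin(2 * phi) * gx
    gamma_B = np.cos(2 * phi) * gx - np.sin(2 * phi) * gp
    return gamma_E, gamma_B

rng = np.random.default_rng(0)
psi = rng.normal(size=(33, 33))  # odd grid size avoids unpaired Nyquist modes
gamma_plus, gamma_cross = shear_from_potential(psi)
gamma_E, gamma_B = eb_decompose(gamma_plus, gamma_cross)

# Ratio of the largest B-mode amplitude to the largest E-mode amplitude
b_to_e = np.max(np.abs(gamma_B)) / np.max(np.abs(gamma_E))
```

Rotating every galaxy by 45° (replacing the shear pair by minus the cross component and the plus component) swaps the roles of E and B, which is one way to generate a pure B-mode field for testing an estimator.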
The E-mode shear power spectrum is simply $(l^2/2)^2$ times the lensing potential power spectrum.
The latter may be found from the Limber (small-angle) approximation in terms of the Newtonian
potential power spectrum, yielding

$$C_{EE}(l) = l^4 \int_0^{D_C} \left[\cot_K(D_{Cl}) - \cot_K(D_C)\right]^2 \frac{P_\Phi(k = l/D_{Al})}{D_{Al}^2}\, dD_{Cl}. \qquad (70)$$

(Here the power spectrum is evaluated at the redshift corresponding to $D_{Cl}$.) We may put this in
a more familiar form by recalling Poisson's equation, which tells us that the potential and matter
density perturbations are related by

$$P_\Phi(k) = \left[\tfrac{3}{2}\,\Omega_m H_0^2 (1+z)\right]^2 k^{-4} P_\delta(k), \qquad (71)$$

yielding

$$C_{EE}(l) = \int_0^{D_C} \left[W(D_{Cl}, D_C)\right]^2 \frac{P_\delta(k = l/D_{Al})}{D_{Al}^2}\, dD_{Cl}, \qquad (72)$$

where the lensing window function is

$$W(D_{Cl}, D_C) = \tfrac{3}{2}\,\Omega_m H_0^2 (1+z_l)\, D_{Al}^2 \left[\cot_K(D_{Cl}) - \cot_K(D_C)\right] \Theta(D_C - D_{Cl}). \qquad (73)$$

The window function describes the contributions to lensing of sources at $D_C$ from lens structures
at distance $D_{Cl}$. Note that it vanishes as the lens approaches the source ($D_{Cl} \rightarrow D_C$). In this
equation, $D_{Al}$ is the comoving angular diameter distance (eq. [9]) to $D_{Cl}$; in a curved universe
$D_{Al} \neq D_{Cl}$. Note that in a flat universe, the window function reduces to

$$W_{\rm flat}(D_{Cl}, D_C) = \tfrac{3}{2}\,\Omega_m H_0^2 (1+z_l)\, \frac{D_{Cl}\,(D_C - D_{Cl})}{D_C}\, \Theta(D_C - D_{Cl}). \qquad (74)$$
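As an illustration of the flat-universe window function in equation (74), here is a minimal sketch. The function name and argument names are hypothetical, distances are comoving in whatever units make $H_0$ (with $c = 1$) the inverse of a distance, and the Heaviside step is implemented with `np.where`.

```python
import numpy as np

def lensing_window_flat(d_lens, d_source, z_lens, omega_m, h0):
    """Flat-universe lensing window function in the form of eq. (74).

    d_lens, d_source : comoving distances to lens and source (same units as 1/h0)
    z_lens           : redshift corresponding to d_lens
    omega_m, h0      : matter density parameter and Hubble constant (c = 1)
    """
    d_lens = np.asarray(d_lens, dtype=float)
    geometry = d_lens * (d_source - d_lens) / d_source
    # Heaviside step: lenses at or behind the source do not contribute
    geometry = np.where(d_lens < d_source, geometry, 0.0)
    return 1.5 * omega_m * h0**2 * (1.0 + z_lens) * geometry

# Toy example in units where h0 = 1, ignoring the (1 + z_l) growth of the prefactor
lens_distances = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.5])
window = lensing_window_flat(lens_distances, 1.0, 0.0, omega_m=0.3, h0=1.0)
```

With the $(1+z_l)$ factor held fixed, the geometric part peaks for a lens halfway to the source and vanishes at the observer and at the source, which is the behavior described in the text.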

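One may also define the angular correlation function of the shear for two galaxies separated
by angle $\vartheta$. Since the shear is a tensor, this is more complicated than the correlation function for
scalars. Without loss of generality, we may rotate the coordinate system so that the galaxies are
separated along the $\theta_1$-axis, and then take the + and × components of the shear. We then define
the shear correlation functions,

$$C_{++}(\vartheta) = \langle\gamma_+(0)\,\gamma_+(\vartheta)\rangle \quad\text{and}\quad C_{\times\times}(\vartheta) = \langle\gamma_\times(0)\,\gamma_\times(\vartheta)\rangle. \qquad (75)$$

As in the scalar case, these are related to the power spectra:

$$\begin{aligned}
C_{++}(\vartheta) &= \langle\gamma_+(0)\,\gamma_+(\vartheta)\rangle \\
&= \int\frac{d^2\mathbf{l}}{(2\pi)^2}\int\frac{d^2\mathbf{l}'}{(2\pi)^2}\, \langle\tilde\gamma_+^*(\mathbf{l})\,\tilde\gamma_+(\mathbf{l}')\rangle \exp(i l'\vartheta\cos\phi_{l'}) \\
&= \int\frac{d^2\mathbf{l}}{(2\pi)^2} \left[\cos^2(2\phi_l)\,C_{EE}(l) + \sin^2(2\phi_l)\,C_{BB}(l)\right] \exp(i l\vartheta\cos\phi_l) \\
&= \int \left[\frac{J_0(l\vartheta) + J_4(l\vartheta)}{2}\,C_{EE}(l) + \frac{J_0(l\vartheta) - J_4(l\vartheta)}{2}\,C_{BB}(l)\right] \frac{l\,dl}{2\pi},
\end{aligned} \qquad (76)$$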



Footnote (Limber approximation): See Limber (1953) and Limber (1954) for an introduction to the theory. An exposition in terms of the power
spectrum is given by Peebles (1973).
Footnote (lensing window function): Warning: many conventions in use!
Footnote (equation 73): The Heaviside step function is technically unnecessary in equation (73), but it is convenient when considering
multiple populations of sources.
where $J_0$ and $J_4$ are Bessel functions of the first kind. The expression for $C_{\times\times}$ is similar, but with $C_{EE}$ and
$C_{BB}$ switched. The correlation function $\{C_{++}(\vartheta), C_{\times\times}(\vartheta)\}$, if measured over all scales, contains
exactly the same information as the power spectrum $\{C_{EE}(l), C_{BB}(l)\}$, as one can be derived from
the other. Therefore, the choice of which to measure is usually a technical one based on the ease of
data processing and handling of covariance matrices. The condition for no B-modes, $C_{BB}(l) = 0$,
is more complicated in correlation-function space.
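To make the last two statements concrete, the $C_{\times\times}$ expression can be written out by switching $C_{EE}$ and $C_{BB}$ in the final line of equation (76); taking the sum and difference then isolates the two Bessel kernels. This is a short worked step under the flat-sky conventions used above, not an additional result:

$$C_{\times\times}(\vartheta) = \int \left[\frac{J_0(l\vartheta) - J_4(l\vartheta)}{2}\,C_{EE}(l) + \frac{J_0(l\vartheta) + J_4(l\vartheta)}{2}\,C_{BB}(l)\right] \frac{l\,dl}{2\pi},$$

$$C_{++}(\vartheta) + C_{\times\times}(\vartheta) = \int J_0(l\vartheta)\left[C_{EE}(l) + C_{BB}(l)\right]\frac{l\,dl}{2\pi}, \qquad C_{++}(\vartheta) - C_{\times\times}(\vartheta) = \int J_4(l\vartheta)\left[C_{EE}(l) - C_{BB}(l)\right]\frac{l\,dl}{2\pi}.$$

With $C_{BB}(l) = 0$ both combinations are integrals of the same function $C_{EE}(l)$ against different kernels, so the no-B-mode condition becomes an integral relation between the two correlation functions rather than a statement at a single angle.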

An infinite number of other second-order statistics (i.e., expectation values containing two
powers of shear) can be constructed, such as the aperture-mass variance (Schneider et al., 1998),
ring statistics (Schneider and Kilbinger, 2007), and finite-interval orthogonal basis decompositions
(Schneider et al., 2010). These alternative statistics were introduced because they have useful
properties from the point of view of data processing or systematics control (e.g. for separation
of E and B modes, or restriction to a particular range of scales), but all of them are expressible as
integrals over the power spectrum or correlation function.

Formulae such as (72) and (76) may be generalized to the full sky, as was first done for CMB
polarization, but for cosmic shear most applications involve small angular scales where the flat-sky 
approximation suffices. 

Having built the formalism to describe the statistics of weak lensing, we can now consider the 
proposed ways of using it to measure cosmology. Some methods will depend only on the expansion 
history of the universe, while others are sensitive to the growth of perturbations. 

5.2.4. Method I: Cosmic Shear Power Spectrum*

The conceptually simplest approach to using WL is to collect a sample of source galaxies, obtain
an estimator for the shear at each galaxy, measure the correlation function or power spectrum, and
compare the result to equation (72). Of course not all galaxies are at the same redshift, but there is
a probability distribution of distances $p(D_C)$, and the observed mean shear in a particular region
of sky is then

$$\gamma_{+,\times}(\boldsymbol\theta) = \int_0^{D_{C,\max}} p(D_C)\, \gamma_{+,\times}(D_C, \boldsymbol\theta)\, dD_C, \qquad (77)$$

where $D_{C,\max}$ is the comoving distance to the farthest galaxy in the slice. The power spectrum of
this field can then be written as

$$C_{EE}(l) = \int_0^{D_{C,\max}} \left[W_{\rm eff}(D_{Cl})\right]^2 \frac{P_\delta(k = l/D_{Al})}{D_{Al}^2}\, dD_{Cl}. \qquad (78)$$

This is similar to equation (72) with $W$ replaced by an effective window function,

$$W_{\rm eff}(D_{Cl}) = \int_0^{D_{C,\max}} p(D_C)\, W(D_{Cl}, D_C)\, dD_C, \qquad (79)$$

which is simply the usual window function appropriately weighted over the source galaxies.
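The source-weighted window of equation (79) is a one-dimensional integral that is easy to evaluate numerically. The sketch below assumes a flat universe, units with $H_0 = c = 1$, and a toy source distribution that is uniform in comoving distance; all function names and the toy distribution are illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

def window_flat(d_lens, d_source, z_lens, omega_m, h0):
    """Flat-universe window function of eq. (74); d_source may be an array."""
    d_source = np.asarray(d_source, dtype=float)
    geometry = d_lens * (d_source - d_lens) / d_source
    geometry = np.where(d_source > d_lens, geometry, 0.0)  # Heaviside step
    return 1.5 * omega_m * h0**2 * (1.0 + z_lens) * geometry

def effective_window(d_lens, z_lens, d_sources, p_sources, omega_m, h0):
    """Trapezoid-rule estimate of eq. (79): integral of p(D_C) W(D_Cl, D_C) dD_C."""
    integrand = p_sources * window_flat(d_lens, d_sources, z_lens, omega_m, h0)
    steps = np.diff(d_sources)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * steps))

# Toy source population: uniform in comoving distance between 1 and 2 (normalized)
d_sources = np.linspace(1.0, 2.0, 2001)
p_sources = np.ones_like(d_sources)

w_eff = effective_window(0.5, 0.0, d_sources, p_sources, omega_m=0.3, h0=1.0)

# Closed form for this toy case: 1.5 * omega_m * 0.5 * (1 - 0.5 * ln 2)
w_exact = 1.5 * 0.3 * 0.5 * (1.0 - 0.5 * np.log(2.0))
```

For a realistic survey, `p_sources` would come from the calibrated redshift distribution of the source sample, converted to comoving distance.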

The cosmic shear power spectrum $C_{EE}(l)$ is sensitive to many cosmological parameters. Being
an integral over the matter power spectrum, it is $\propto \sigma_8^2$ in the linear regime, although its behavior
in the nonlinear regime is closer to $\propto \sigma_8^3$. It also contains two powers of $\Omega_m$, so we expect that the
most important dependences in the problem are that the WL power spectrum scales roughly as
$\Omega_m^2\sigma_8^2$ to $\Omega_m^2\sigma_8^3$.
This is qualitatively correct, but the matter power spectrum and the mapping between $D_A$ and
$D_C$ at finite redshift contain sensitivities to all of the cosmological parameters, and so a full answer
to the question "what does the shear power spectrum constrain?" requires us to actually do the
integral to obtain $C_{EE}(l)$.






The sensitivity to every parameter is both a virtue of the WL power spectrum and its greatest
fault: the featureless WL power spectrum contains too many parameter degeneracies. One way
to break these degeneracies is to combine WL with other probes, as discussed elsewhere in this
review. However, there are also ways of using WL that provide additional information and break
these degeneracies internally, as we now discuss.

5.2.5. Method II: Power Spectrum Tomography* 

We can improve on the WL power spectrum constraints if we can split the source galaxies into
redshift slices. In most practical cases, this would be done with photometric redshifts. In this case,
instead of having a single power spectrum, we have $N(N+1)/2$ power spectra and cross-spectra;
if we denote the slices by $\alpha, \beta \in \{1, 2, \ldots, N\}$, then these spectra are

$$C_{EE}^{\alpha\beta}(l) = \int W_{{\rm eff},\alpha}(D_{Cl})\, W_{{\rm eff},\beta}(D_{Cl})\, \frac{P_\delta(k = l/D_{Al})}{D_{Al}^2}\, dD_{Cl}, \qquad (80)$$

where $W_{{\rm eff},\alpha}$ is the effective window function for the $\alpha$ slice. Note that because the window functions
are multiplied, this power spectrum depends only on the matter power spectrum at redshifts closer
than that of the nearer slice, i.e. at $z < \min\{z_\alpha, z_\beta\}$. This makes sense because a given lens
structure must be in front of both sources to contribute to the shear cross-correlation. The splitting
of samples by redshift and the use of the redshift scalings to constrain cosmology are known as
tomography.
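The bookkeeping of tomographic spectra can be sketched in a few lines. The helper names below are hypothetical; the code only enumerates the distinct slice pairs and records the redshift limit implied by the product of window functions, and it does not compute any spectrum.

```python
from itertools import combinations_with_replacement

def tomographic_pairs(n_slices):
    """All distinct (alpha, beta) pairs with alpha <= beta, labeled 1..N."""
    return list(combinations_with_replacement(range(1, n_slices + 1), 2))

def max_lens_redshift(z_alpha, z_beta):
    """Lenses contribute to a cross-spectrum only in front of both source slices."""
    return min(z_alpha, z_beta)

pairs = tomographic_pairs(5)
n_spectra = len(pairs)  # equals N(N+1)/2 auto- and cross-spectra
```

The count grows quadratically with the number of slices, which is the origin of the many internal consistency checks mentioned below, although the broad window functions mean the spectra are far from independent.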

Like the shear power spectrum, the tomographic spectra are sensitive to both the background
geometry and the growth of structure: the shear power spectrum at $l$ depends on the $D_C(z)$
relation, on $P_\delta(k = l/D_A; z)$ as a function of redshift, and on the curvature $K$. With a single
power spectrum $C_{EE}(l)$ there is no hope of disentangling these functions with WL alone. One
might hope that having the tomographic cross-spectra as a function of $z_\alpha$ and $z_\beta$ would allow the
relevant degeneracies to be broken. Unfortunately, such a program runs into three problems:

• A real WL survey has a maximum source redshift, and there is obviously no sensitivity to
structures farther than this.

• There exist exact degeneracies among $\{D_C(z), P_\delta(k = l/D_A; z), K\}$ that lead to exactly the
same lensing power spectra for all $(l, z_\alpha, z_\beta)$. The most obvious of these is the re-scaling
degeneracy: since lensing measures only dimensionless shears, it cannot measure the absolute
distance scale, only distance ratios. Two other degeneracies are discussed by Bernstein (2006);
see also §5.2.7.

• The broad, smooth nature of the lensing window functions $W_{{\rm eff},\alpha}(D_C)$ implies near-degeneracies
between power spectra at adjacent redshifts. For example, if one were to test a nonstandard
cosmology in which $P_\delta(k, z)$ had a rapid oscillation in $z$ superposed on the expected evolution,
the rapid oscillation would contribute little to equation (80) and would be easily buried by
statistical or systematic errors.

Despite these drawbacks, tomographic power spectra have far fewer parameter degeneracies than
the shear power spectrum alone. More importantly, having $N(N+1)/2$ power spectra provides
many additional opportunities for internal consistency tests and rejection of systematic errors.
Some examples of theoretical tomographic power spectra are shown in Fig. 15.



Footnote (functions entering the tomographic spectra): There is also a factor of $\Omega_m H_0^2$ in the window functions, but we will assume this combination has been measured
accurately from the CMB.
Figure 15 The E-mode shear power spectra predicted for the WMAP 7-year best fit cosmology
($\Omega_m = 0.265$, $\sigma_8 = 0.8$, $H_0 = 71.9$ km/s/Mpc). The curves show power spectra for sources at
z = 0.5 (bottom), 1.0, and 2.0 (top). The diagonal line shows the shot noise contribution at a
source density of $n_{\rm eff} = 10$ galaxies per arcmin$^2$.



5.2.6. Method III: Galaxy-galaxy Lensing* 

A third way to use weak lensing is to look not just at the shear power spectrum but at its
correlation with the distribution of foreground galaxies. This subject is known as galaxy-galaxy
lensing (GGL), and it is a powerful probe of the relation between dark matter and galaxies. The
angular cross-power spectrum between the galaxies in one redshift slice $\alpha$ (the "foreground" or
"lens" slice) and the E-mode shear in a more distant slice $\beta$ (the "background" or "source" slice)
is defined by

$$\langle\tilde\delta_g^{\alpha *}(\mathbf{l})\,\tilde\gamma_E^{\beta}(\mathbf{l}')\rangle = (2\pi)^2\, C_{gE}^{\alpha\beta}(l)\, \delta^{(2)}(\mathbf{l} - \mathbf{l}'), \qquad (81)$$

where $\delta_g$ is the 2-dimensional projected galaxy overdensity and $\tilde\delta_g$ is its Fourier transform, and $\alpha$
and $\beta$ represent redshift slices. It can be computed via Limber's equation as

$$C_{gE}^{\alpha\beta}(l) = \int p_\alpha(D_{Cl})\, W_{{\rm eff},\beta}(D_{Cl})\, \frac{P_{g\delta}(k = l/D_{Al})}{D_{Al}^2}\, dD_{Cl}, \qquad (82)$$

where $P_{g\delta}(k)$ is the 3-dimensional galaxy-matter cross-spectrum. The real-space correlation function
of galaxy density and shear is

$$C_{g+}^{\alpha\beta}(\vartheta) = -\int C_{gE}^{\alpha\beta}(l)\, J_2(l\vartheta)\, \frac{l\,dl}{2\pi}. \qquad (83)$$

In the case where the foreground galaxy slice ($\alpha$) is narrow, either due to use of spectroscopic
foregrounds or high-quality photo-zs, the integral in Limber's equation (eq. [82]) becomes a
$\delta$-function, and the galaxy-matter cross-spectrum can be obtained.

One can also measure GGL by computing the mean tangential shear (i.e., shear in the direction
orthogonal to the lens-source vector) of background galaxies around foreground galaxies as
a function of radius. This view of the measurement is taken in many papers, but it is (almost)
mathematically equivalent to correlating the shear field of the background galaxies with the density
field of the foreground galaxies.

From the perspective of dark energy studies, the principal advantage of GGL over the shear
power spectrum is observational: the shear is being correlated with galaxies rather than itself. A 
spurious source of shear, e.g. from imperfections in the PSF model, is a source of systematic error 
in the shear power spectrum, but in GGL it is only a source of noise because it is equally likely to 
arise in regions of high and low foreground galaxy density. The principal disadvantage of GGL is 
that its interpretation requires assumptions about the galaxies, which must ultimately be justified 
empirically. 

Galaxy-galaxy lensing can be used in the linear, the weakly nonlinear, and the fully nonlinear 
regimes: 

• Linear regime: In the linear regime, the galaxy-matter cross spectrum is $P_{g\delta}(k) = bP_\delta(k)$,
where $b$ is the galaxy bias factor. Thus $C_{gE}^{\alpha\beta}(l)$ is proportional to $b\sigma_8^2$, whereas the galaxy
power spectrum is proportional to $(b\sigma_8)^2$. This provides a way to measure the linear bias of
the galaxies and hence obtain $\sigma_8$. Unfortunately, one must reach very large scales (tens of
Mpc) for linear perturbation theory to be valid at the few percent level of accuracy, and at
these scales the signal-to-noise ratio of current GGL results is very low.

• Weakly nonlinear regime: At scales of order $\sim 10\,h^{-1}$ Mpc, nonlinear effects simply represent a
correction to the linear theory, and one might hope that a judicious combination of observables
might remove them. The key is to note that when stochasticity between galaxy and matter
densities is included, the GGL signal is proportional to $b r \sigma_8^2$, where $r$ is the galaxy-matter
cross-correlation coefficient, so all we require to extract $b$ and $\sigma_8$ individually from GGL
and galaxy clustering observables is a theoretical prediction for the stochasticity. This is a
convenient result because simulations show that $r = 1$ is a much better approximation in the
weakly nonlinear regime than $b$ = constant. This type of analysis is also best done in real
space rather than Fourier space so that the 1-halo contributions (see §2.3) to both clustering
and lensing can be eliminated. A specific outline for how to do this, including the next-order
perturbation theory corrections to $r = 1$ and comparison to simulations, is presented by
Baldauf et al. (2010).

• Fully nonlinear regime: GGL can be used on the scale of individual haloes ($k \sim 1$-$10$ Mpc$^{-1}$)
to relate galaxy properties such as luminosity, color, and stellar mass to the properties of the
host dark matter halo. Such relations cannot be predicted ab initio because of the complicated
astrophysics involved. Empirical constraints on these relations are useful for dark energy
studies mostly because they enable us to test some of the underlying assumptions of galaxy
clustering models. To gain some cosmological power beyond the weakly non-linear regime,
one can construct full galaxy HOD models and marginalize over their parameters, using both
GGL and galaxy clustering as constraints (Yoo et al., 2006).
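The way GGL and galaxy clustering combine in the first two regimes can be written as a short worked step. Writing $C_{gg}$ for the galaxy auto-spectrum of the lens slice (a symbol introduced here only for illustration) and keeping only the scalings quoted in the list above,

$$C_{gE} \propto b\, r\, \sigma_8^2, \qquad C_{gg} \propto b^2 \sigma_8^2 \quad\Longrightarrow\quad \frac{C_{gE}^2}{C_{gg}} \propto r^2 \sigma_8^2, \qquad \frac{C_{gg}}{C_{gE}} \propto \frac{b}{r}.$$

In the linear regime $r = 1$ and the first ratio isolates $\sigma_8^2$ while the second isolates $b$; in the weakly nonlinear regime the same combinations apply once a theoretical prediction for $r$ is supplied. The proportionality constants depend on the window functions and redshift distributions and are omitted in this sketch.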



Cluster-galaxy lensing is similar to GGL, but one takes clusters of galaxies rather than individual
galaxies as the reference points (Mandelbaum et al., 2006; Sheldon et al., 2009). We will discuss this
idea further later in this review, arguing that it offers the most reliable route to calibrating cluster
mass-observable relations and has the potential to sharpen cosmological parameter constraints
significantly.

5.2.7. Method IV: Cosmography* 

The previous sections motivate us to ask whether there is a way to combine the observational 
advantages of GGL with the model independence of the shear power spectrum. There is, although 
there is a large price to pay: one can only obtain geometrical information. 
The idea is to consider narrow slices of galaxies centered at redshifts $z_\alpha < z_\beta < z_\gamma$ and measure
the lensing of galaxies in slices $z_\beta$ and $z_\gamma$ by galaxies in the foreground slice $z_\alpha$. The ratio of the
galaxy-shear cross-spectra is, using equation (82),

$$\frac{C_{gE}^{\alpha\beta}(l)}{C_{gE}^{\alpha\gamma}(l)} = \frac{\cot_K D_C(z_\alpha) - \cot_K D_C(z_\beta)}{\cot_K D_C(z_\alpha) - \cot_K D_C(z_\gamma)}. \qquad (84)$$

One can see that all dependence on the power spectra and the distribution of galaxies has been
cancelled, allowing a purely geometric test of cosmology. This is called the cosmography or
shear-ratio test (Jain and Taylor, 2003; Bernstein and Jain, 2004).



One can see from equation (84) that cosmography can determine the $\cot_K D_C(z)$ relation up
to any affine transformation, i.e. transformations of the form

$$\cot_K D_C(z) \rightarrow a_0 + a_1 \cot_K D_C(z), \qquad (85)$$

which leave the ratios of differences of $\cot_K D_C(z)$ unaffected. (Recall that $\cot_K D_C = 1/D_C$ in a
flat universe.) It is clear that $a_1$ is the familiar overall rescaling degeneracy: cosmography measures
only dimensionless ratios and cannot distinguish two models with different $H_0$ but the same values
of $\Omega_m$, $w$, etc. Precisely the same degeneracy afflicts the supernova $D_L(z)$ relation because the
absolute magnitude of a Type Ia supernova is not known a priori. The $a_0$ degeneracy is trickier,
arising from the fact that $\infty$ is not a special distance in lensing problems. Finally, since only
$\cot_K D_C(z)$ is measured, cosmography cannot by itself provide a model-independent measurement
of the curvature of the universe. But aside from these three degeneracies ($a_1$, $a_0$, and $K$), the
entire geometry of the universe over the range of redshifts observed is measurable.

Unfortunately, the aforementioned degeneracies are similar in functional form to the effects
of $\Omega_m$ and $w$, and they have severely limited the application of cosmography thus far. This is
particularly true for observations restricted to low redshift: if one Taylor expands the distance as
$\tan_K D_C(z) = c_1 z + c_2 z^2 + c_3 z^3 + \ldots$ then any cosmological model is degenerate with one that has
$(c_1, c_2) = (1, 0)$, and hence one must go through at least the $z^3$ term before cosmography provides
any useful information. For example, at $(z_\alpha, z_\beta, z_\gamma) = (0.25, 0.35, 0.70)$, the difference in the shear
ratio (eq. [84]) between an $\Omega_m = 0.3$ flat ΛCDM cosmology and a pure CDM $\Omega_m = 1$ cosmology is
only 1%! In early work (Mandelbaum et al., 2005) cosmography was therefore used as a test for
shear systematics rather than a cosmological probe.
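The 1% figure quoted above can be reproduced with a few lines of numerical integration. The sketch below is illustrative only: it assumes flat universes with a cosmological constant (so that $\cot_K D_C = 1/D_C$), expresses distances in units of $c/H_0$, and uses a simple trapezoid rule; the function names are hypothetical.

```python
import numpy as np

def comoving_distance(z, omega_m, n_steps=4000):
    """Flat-universe comoving distance in units of c/H0 (trapezoid rule)."""
    zs = np.linspace(0.0, z, n_steps + 1)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    return float(np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs)))

def ratio_from_cot(cot_a, cot_b, cot_g):
    """Shear ratio of eq. (84) from the three cot_K D_C values."""
    return (cot_a - cot_b) / (cot_a - cot_g)

def shear_ratio(z_a, z_b, z_g, omega_m):
    """Shear ratio for lens slice z_a and source slices z_b < z_g (flat universe)."""
    cots = [1.0 / comoving_distance(z, omega_m) for z in (z_a, z_b, z_g)]
    return ratio_from_cot(*cots)

ratio_lcdm = shear_ratio(0.25, 0.35, 0.70, omega_m=0.3)  # flat LambdaCDM
ratio_cdm = shear_ratio(0.25, 0.35, 0.70, omega_m=1.0)   # pure CDM
relative_difference = abs(ratio_lcdm - ratio_cdm) / ratio_cdm
```

Because the ratio depends only on differences of $\cot_K D_C$, applying the affine map of equation (85) to all three values leaves `ratio_from_cot` unchanged, which is the degeneracy discussed in the text.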

The outlook for cosmography is much brighter as we probe to larger redshifts, or if we consider
dark energy models with complicated redshift dependences that cannot be mimicked by the
degeneracy of equation (85). A particularly promising possibility is to use cosmography with lensing
of the anisotropies in the CMB ($z = 1100$) to obtain a much longer lever arm (Acquaviva et al.,
2008). In principle one can also apply the cosmography method to strong gravitational lenses. Here
the challenge is that different sources probe different locations in the lens, so one must be able to
constrain the lens potential extremely well to extract useful cosmographic constraints.

5.2.8. Method V: Non-Gaussian Statistics* 

The primordial density fluctuations in the universe were very nearly Gaussian, as evidenced
by the CMB. In this case, the fluctuations are fully described by the power spectrum, and this
has become the common language of CMB observations. However, nonlinear evolution makes the
matter fluctuations and hence the lensing shear in the low-redshift universe highly non-Gaussian
on small and intermediate scales. Therefore, many other statistical measures of the shear field have
been proposed, the most popular of which is the bispectrum.

Footnote (the $a_0$ degeneracy in cosmography): This is the same reason that the "$\infty$" setting on the focus knob for a camera is not special.

The bispectrum is obtained by taking the product of three Fourier modes:

$$\langle\tilde\gamma_E^{\alpha}(\mathbf{l}_1)\,\tilde\gamma_E^{\beta}(\mathbf{l}_2)\,\tilde\gamma_E^{\gamma}(\mathbf{l}_3)\rangle = (2\pi)^2\, B_{EEE}^{\alpha\beta\gamma}(l_1, l_2, l_3)\, \delta^{(2)}(\mathbf{l}_1 + \mathbf{l}_2 + \mathbf{l}_3). \qquad (86)$$

Statistical homogeneity forces the three wave vectors involved to sum to zero so the bispectrum
is actually a function of the triangle configuration; rotational and reflection symmetry then tell
us that it depends only on the side lengths $(l_1, l_2, l_3)$, which must satisfy the triangle inequality.
Because there are 2 shear modes (E and B), there are actually 4 types of bispectrum: EEE, EEB,
EBB, and BBB, but only EEE can be produced cosmologically. Limber's equation expresses it
in terms of the 3-dimensional matter bispectrum,

$$B_{EEE}^{\alpha\beta\gamma}(l_1, l_2, l_3) = \int W_{{\rm eff},\alpha}(D_{Cl})\, W_{{\rm eff},\beta}(D_{Cl})\, W_{{\rm eff},\gamma}(D_{Cl})\, \frac{B_\delta(k_i = l_i/D_{Al})}{D_{Al}^4}\, dD_{Cl}. \qquad (87)$$

The bispectrum contains information equivalent to the shear 3-point correlation function.

The original motivation to study the WL shear bispectrum was to break the degeneracy between
$\Omega_m$ and $\sigma_8$. At low redshift, and on large scales where perturbation theory applies, the WL power
spectrum is proportional to $\Omega_m^2\sigma_8^2$, whereas the bispectrum is proportional to $\Omega_m^3\sigma_8^4$; it contains
three powers of the shear and hence three powers of $\Omega_m$, but the matter bispectrum is generated by
nonlinear interactions and is proportional to the square of the matter power spectrum, i.e., to $\sigma_8^4$
rather than $\sigma_8^3$. Unfortunately, this route to degeneracy breaking has proven difficult because of the
low signal-to-noise ratio and high sampling variance of the bispectrum and because the degeneracy
directions of the power spectrum and bispectrum are almost parallel in the $(\Omega_m, \sigma_8)$ plane. A more
interesting application of the WL bispectrum in future surveys may be as a constraint on modified
gravity theories, though this has not yet been well studied.
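The idealized version of this degeneracy-breaking argument can be written out explicitly. Denoting the WL power spectrum and bispectrum amplitudes by $C$ and $B$ (schematic symbols used only for this step) and keeping just the low-redshift, perturbative scalings quoted above,

$$C \propto \Omega_m^2 \sigma_8^2, \qquad B \propto \Omega_m^3 \sigma_8^4 \quad\Longrightarrow\quad \frac{C^2}{B} \propto \Omega_m, \qquad \frac{B^2}{C^3} \propto \sigma_8^2.$$

In this limit the two amplitudes would determine $\Omega_m$ and $\sigma_8$ separately. The ratios involve high powers of noisy quantities, which is one way to see why the large sampling variance of the bispectrum limits the method in practice.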



5.3. The Current State of Play 

Weak lensing as a cosmological probe is only a decade old, although the ideas go back much
farther. Zwicky (1937) famously suggested gravitational lensing as a tool to determine cluster
masses (although the discussion focused on strong lensing). We separately consider here the more
recent history of cosmic shear studies, and of galaxy-galaxy lensing as a cosmological probe. Also the
techniques and applications associated with lensing outside the optical bandpasses are sufficiently
different that we place them in a separate section. Lensing by clusters is considered in the cluster
section.



5.3.1. Cosmic shear

Kristian (1967) described an initial attempt to measure statistical cosmic shear using
photographic plates taken on the Palomar 5 m telescope. He even correctly identified intrinsic alignments
as a systematic error, and noted that the distance dependence could be used to separate them from
true cosmic shear. Interestingly, the objective of this analysis was to search for cosmological-scale
gravitational waves or other large-scale anisotropies (Kristian and Sachs, 1966). The author set a
limit on the magnetic part of the Weyl tensor of $< 200\,H_0^2$, which he describes as "about the
best that can be done with this kind of measurement." Fortunately this has not remained the case;
indeed it was improved upon by two orders of magnitude by Valdes et al. (1983).

Footnote (bispectrum types): The EEB and BBB bispectra flip sign under reflections of the triangle, and some convention, e.g. that the sides
are given in counterclockwise order, must be imposed to avoid ambiguity.

Footnote (Weyl tensor limit): Equivalent to $\sim \omega^2 h$, where $\omega$ is the gravitational wave frequency and $h$ is the strain.

The modern era of lensing studies was introduced by the availability of arrays of large-format
CCDs. Mould et al. (1994) searched for cosmic shear and reached percent-level sensitivity, but did
not detect a signal. Cosmic shear was finally detected in 2000 by several groups (Wittman et al.,
2000; Bacon et al., 2000; van Waerbeke et al., 2000), and in deeper but narrower data from HST
(Rhodes et al., 2001; Refregier et al., 2002). Over the same period, several additional square
degrees were observed with long exposure times in excellent seeing using ground-based telescopes
(Van Waerbeke et al., 2001, 2002; Bacon et al., 2003; Hamana et al., 2003). The first wide-shallow
surveys were also carried out from the ground: the 53 deg$^2$ Red-Sequence Cluster Survey (Hoekstra
et al., 2002) and the 75 deg$^2$ CTIO survey (Jarvis et al., 2003). These studies established the existence
of cosmic shear, but at a level far below that which would be expected in $\Omega_m \sim 1$ models
normalized to the CMB. The large error bars in early studies meant that only a single amplitude
could be measured, yielding a constraint on the combination $\sigma_8(\Omega_m/0.3)^\nu$, where the exponent $\nu$
varied between 0.3 and 0.7 depending on the scale and depth. In the first detection of the cosmic
shear bispectrum, achieved with the VIRMOS-DESCART survey, Pen et al. (2003) measured the
skewness of the filtered shear signal and used it in combination with the power spectrum to rule
out large-$\Omega_m$, low-$\sigma_8$ solutions, finding $\Omega_m < 0.5$ at 90% confidence. The deep COMBO-17 survey
first detected the evolution of $\sigma_8$ as a function of cosmic time (Bacon et al., 2005).

However, the early studies of cosmic shear were not free of trouble. As one can see from
Table 3, while most were broadly in agreement with $\sigma_8$ in the 0.7-0.9 range, a detailed comparison
shows that the measurements were not all consistent. This discrepancy stimulated discussions
about a number of possible ancillary issues with the data, such as the role of intrinsic alignments,
whether the source redshift distribution $N(z)$ was properly calibrated, and whether the models for
the nonlinear power spectrum and assumptions about the shape parameter $\Gamma$ could be leading to
discrepancies. More seriously, most of the early measurements contained B-mode signals at levels
not far below the E-mode. This was a clear signal of contamination of non-cosmological origin,
probably PSF correction residuals. Also, intrinsic alignments of galaxies were detected at high
significance even in the linear regime, at a level that represented a potentially serious systematic
error even for then-ongoing surveys (Mandelbaum et al., 2006).



It was clear by 2006 that weak lensing was a very hard observational problem and that a great
deal of work lay ahead to turn it into a precision cosmological probe. This resulted in a reduction in
the rate of new cosmic shear results, the reorganization of the field into larger teams, and detailed
looks at systematic errors ranging from optical distortions in telescopes to intrinsic galaxy
alignments. Several wide-field optical surveys were ongoing at the time, including the deep 170 deg$^2$
CFHT Legacy Survey (for which cosmic shear was a key science driver) and the very deep
multi-wavelength COSMOS survey with high-resolution optical imaging from HST/ACS (Massey et al.,
2007a; Schrabback et al., 2010). The CFHTLS presented some early results (Hoekstra et al., 2006;
Semboloni et al., 2006; Fu et al., 2008), but following this there was a rather bleak period of time.
No new ground-based wide-field cosmic shear results were published, and no new large surveys were
undertaken with HST, nor do future HST weak lensing surveys seem likely.

In the past five years, however, great progress has been made in overcoming the difficulties 



Footnote (future HST surveys): The premier lensing instrument on the HST (the Advanced Camera for Surveys) failed in January 2007. While
its wide-field channel was restored during the 2009 servicing mission, the sky coverage possible with ACS is not
competitive with next-generation ground-based surveys and it seems unlikely a major cosmic shear program will be
undertaken with HST. Rather, the next major step in space-based cosmic shear will likely be the dedicated Euclid
mission planned for 2019.
Table 3 A summary of cosmic shear results from the literature obtained in the optical. Note that
some of these results are independent analyses or extensions of previous data sets and hence are
not independent.

| Reference | Telescope/instrument | Area (deg$^2$) | Number of galaxies | Result |
| Bacon et al. (2000) | WHT/EEV-CCD | 0.5 | 27k | $\sigma_8 = 1.5 \pm 0.5$ (@ $\Omega_m = 0.3$) |
| Van Waerbeke et al. (2000) | CFHT/UH8K+CFH12K | 1.75 | 150k | Detection (a) |
| Wittman et al. (2000) | Blanco/BTC | 1.5 | 146k | Detection (b) |
| Rhodes et al. (2001) | HST/WFPC2 | 0.05 | 4k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.91^{+\ldots}_{-\ldots}$ |
| Van Waerbeke et al. (2001) | CFHT/CFH12K | 6.6 | 400k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.99^{+\ldots}_{-\ldots}$ (95% CL) (c) |
| Hoekstra et al. (2002) | CFHT/CFH12K + Blanco/Mosaic II | 53 | 1.78M | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.87^{+\ldots}_{-\ldots}$ (95% CL) |
| Refregier et al. (2002) | HST/WFPC2 | 0.36 | 31k | $\sigma_8 = 0.94 \pm 0.14$ (@ $\Omega_m = 0.3$, $\Gamma = 0.21$) |
| Bacon et al. (2003) | Keck II/ESI + WHT | 1.6 | … | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.97 \pm 0.13$ |
| Brown et al. (2003) | MPG ESO 2.2m/WFI | 1.25 | … | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.72 \pm 0.09$ (d, e) |
| Jarvis et al. (2003) | Blanco/BTC+Mosaic II | … | … | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.71^{+\ldots}_{-\ldots}$ ($2\sigma$) |
| Hamana et al. (2003) | Subaru/SuprimeCam | 2.1 | 250k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.78^{+\ldots}_{-\ldots}$ (95% CL) |
| Rhodes et al. (2004) | HST/STIS | 0.25 | 26k | $\sigma_8(\Omega_m/0.3)^{\ldots}(\Gamma/0.21)^{\ldots} = 1.02 \pm 0.16$ |
| Heymans et al. (2005) | HST/ACS | 0.22 | 60k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.68 \pm 0.13$ |
| Massey et al. (2005) | WHT/PFIC | 4 | 200k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 1.02 \pm 0.15$ |
| Hoekstra et al. (2006) | CFHT/MegaCam | 22 | 1.6M | $\sigma_8 = 0.85 \pm 0.06$ @ $\Omega_m = 0.3$ |
| Semboloni et al. (2006) | CFHT/MegaCam | 3 | 150k | $\sigma_8 = 0.89 \pm 0.06$ @ $\Omega_m = 0.3$ |
| Benjamin et al. (2007) | Various (g) | 100 | 4.5M | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.74 \pm 0.04$ |
| Hetterscheidt et al. (2007) | MPG ESO 2.2m/WFI | 15 | 700k | $\sigma_8 = 0.80 \pm 0.10$ @ $\Omega_m = 0.3$ |
| Massey et al. (2007a) | HST/ACS | 1.64 | 200k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.866^{+\ldots}_{-\ldots}$ |
| Schrabback et al. (2007) | HST/ACS | 0.4 | 100k | $\sigma_8 = 0.52^{+\ldots}_{-\ldots}$(stat) $\pm\, 0.07$(sys) @ $\Omega_m = 0.3$ (f) |
| Fu et al. (2008) | CFHT/MegaCam | 67 | 1.7M | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.70 \pm 0.04$ |
| Schrabback et al. (2010) | HST/ACS | 1.64 | 195k | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.75 \pm 0.08$ |
| Huff et al. (2011) | SDSS | 168 | 1.3M | $\sigma_8 = 0.636^{+\ldots}_{-\ldots}$ @ $\Omega_m = 0.265$ (h) |
| Lin et al. (2011) | SDSS | 275 | 4.5M | $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.64^{+\ldots}_{-\ldots}$ (h) |

(a) Consistent with $\Omega_m \approx 0.3$ ($\Lambda$ or open), cluster normalized; $\Omega_m = 1$, $\sigma_8 \sim 1$ excluded.
(b) Consistent with ΛCDM or OCDM, but not COBE normalized $\Omega_m = 1$.
(c) Reanalysis by Van Waerbeke et al. (2002) gives $\sigma_8 = 0.98 \pm 0.06$ ($\Omega_m = 0.3$, $\Gamma = 0.2$, 68% CL).
(d) Reanalysis by Heymans et al. (2004) to correct for intrinsic alignments gives $\sigma_8(\Omega_m/0.3)^{\ldots} = 0.67 \pm 0.10$.
(e) Brown et al. (2005) used a subset of this data to show that the matter power spectrum increased with time.
(f) In the Chandra Deep-Field South; the authors warn that this field was selected to be empty, hence $\sigma_8$ may be biased low.
(g) A combination of 4 previously published surveys.
(h) Both based on the same raw SDSS data, but with analyses and reduction pipelines by 2 different groups.



that at first appeared so daunting. The community made a massive investment in algorithms to
determine and correct for PSF ellipticities (we will review some of these in §5.5), and in
investigating the physics that determines the PSF, including such complications as atmospheric turbulence
(Heymans et al., 2011). Equally important, these methods were tested in public challenges on
simulated data (Heymans et al., 2006; Massey et al., 2007b; Bridle et al., 2010). Progress was also
made on astrophysical systematic errors. We learned that large-scale intrinsic galaxy alignments
are strongest for luminous red galaxies (Hirata et al., 2007; Mandelbaum et al., 2011), and that the
linear alignment model, once considered a crude analytical tool (Catelan et al., 2001), is in fact an
excellent description of the observations of early-type galaxies at $> 10\,h^{-1}$ Mpc scales (Blazek et al.,
2011).



As a result of this great effort by the community, the Stage II weak lensing results are finally
coming to fruition and yielding large data sets that pass the standard systematics tests (e.g.,
B-modes consistent with zero). Two groups (Lin et al., 2011; Huff et al., 2011) have performed a
cosmic shear measurement using the Sloan Digital Sky Survey deep co-added region, a 120-degree
long stripe observed many times over the course of three years as part of the SDSS-II
supernova survey. These analyses used different methods to co-add their data and correct for
the PSF ellipticity, and they imposed different selection cuts and hence had different redshift
distributions, yet the results were in agreement (and slightly more than $1\sigma$ below the WMAP
prediction for $\sigma_8$). The analysis of the complete CFHT Legacy Survey data is nearing completion
and has passed the key systematics tests; these results are eagerly anticipated as the next major
step for cosmic shear.

A summary of the current status of optical cosmic shear results is shown in Table 3.

5.3.2. Galaxy-galaxy lensing as a cosmological probe

Like cosmic shear, galaxy-galaxy lensing is an old idea. The earliest astrophysically interesting
upper limit was that of Tyson et al. (1984), who used the images of 200,000 galaxies measured by the
now-obsolete method of digitizing photographic plates to exclude extended isothermal haloes with
$v_c > 200$ km s$^{-1}$ around an apparent magnitude-limited sample of galaxies. Galaxy-galaxy lensing
was observed at $\sim 4\sigma$ by Brainerd et al. (1996), the first clear detection of cosmological weak
lensing. Their analysis used a total of 3202 lens-source pairs in a field of area 0.025 deg$^2$. Several
other detections followed this in deep surveys with limited sky coverage (Hudson et al., 1998;
Smith et al., 2001; Hoekstra et al., 2003). However, the full scientific exploitation of the
galaxy-galaxy lensing signal, in contrast to cosmic shear, favors wide-shallow surveys over
deep-narrow surveys, since the S/N in the shape-noise limited regime scales as only
$n_{\rm source}^{1/2}$ rather than $n_{\rm source}$. Therefore, in the decade of the 2000s the leading
galaxy-galaxy lensing surveys became the 92 deg$^2$ Red-Sequence Cluster Survey (RCS; Hoekstra et al.,
2004, 2005; Kleinheinrich et al., 2006) and eventually the $10^4$ deg$^2$ SDSS (references below). The
availability of spectroscopic redshifts in the latter allowed the signal from low-redshift galaxies
to be stacked in physical rather than angular coordinates, enabling the detection of features as a
function of transverse separation. The spectroscopic survey also provided detailed environmental
information, measures of star-formation history, and full 3-dimensional clustering data (e.g.,
correlation lengths and redshift-space distortions) for the lens galaxies.

The SDSS remains the premier galaxy-galaxy lensing survey today, for both galaxy evolution
and cosmology applications, and it likely will remain so until DES and HSC results become available.
The SDSS Early Data Release, comprising only a few per cent of the overall survey, already detected
the galaxy-galaxy lensing signal with high significance (Fischer et al., 2000; McKay et al., 2001).
Some of the major results of cosmological importance from the SDSS galaxy-galaxy lensing program
have been:

• The galaxy bias can be constrained directly by going to the very largest scales and measuring
galaxy-galaxy lensing in the 2-halo regime. By dividing the galaxy clustering signal by the
galaxy-mass correlation function, Sheldon et al. (2004) found for galaxies a bias of
$b = (1.3 \pm 0.2)(\Omega_m/0.27)\,r$, where $r$ is the stochasticity (presumably $\approx 1$ at the
largest scales), with no evidence of scale dependence.



• The measurement of halo masses (or more accurately, HOD parameters; see §2.3) with
galaxy-galaxy lensing also enables one to predict the galaxy bias, by using the bias-mass
relation $b(M)$. One can in principle use this to constrain cosmological parameters since the
clustering of the lens galaxies can be measured and hence one can obtain $\sigma_8$ from
$\sigma_{8,{\rm gal}} = b\,\sigma_8$. The results of this analysis on 300,000 lens galaxies at
$z \sim 0.1$ were presented in Seljak et al. (2005). The direct constraints on $\sigma_8$ itself (with
other parameters fixed) were uninteresting because the bias at fixed halo mass is a decreasing
function of $\sigma_8$, and the observable product $\sigma_8 b$ is almost independent of $\sigma_8$.
However, cosmological parameters that change the shape of the power spectrum can be constrained
quite well: e.g., a decrease in small-scale power can make halos rarer and hence increase $b$
without a compensating change in $\sigma_8$. This breaks degeneracies internal to the CMB alone.
Combining with first-year WMAP data, Seljak et al. (2005) found that for the case of three
degenerate neutrinos one must have $\sum m_\nu < 0.54$ eV (95% CL).

Since lensing measures $\delta\rho$ rather than $\delta\rho/\bar\rho$, there is a factor of $\Omega_m$ in this measurement.

A re-analysis with the final SDSS imaging data set and improved treatment of the stochasticity is underway.

• The halo mass-concentration relation $c(M)$ (e.g., Bullock et al., 2001) is not in and of itself
especially useful as a dark energy probe; it depends somewhat on $\Omega_m$, but also on baryonic
physics. Nevertheless, testing it is important for any cosmological application of the 1-halo
regime, including cosmic shear (King and Mead, 2011), and galaxy-galaxy lensing is well
suited to measuring it at a range of halo masses. (For clusters other techniques are available,
such as strong lensing or X-ray measurements.) Mandelbaum et al. (2008) measured this
relation across the $10^{12}$ - $10^{15}\,M_\odot$ range, finding
$c_{200b}(M) = (4.6 \pm 0.7)\,M_{14}^{-0.13 \pm 0.07}$, where $M_{14}$ is the halo mass in units of
$10^{14}\,h^{-1} M_\odot$. The normalization is about $2\sigma$ below the theoretical
predictions, but the discrepancy may well be a statistical accident, particularly given that
other methods have led to larger concentrations.



• Reyes et al. (2010) tested GR by comparing the galaxy-mass correlation function, measured
via weak lensing, to the galaxy-velocity correlation function, measured via redshift-space
distortions. The SDSS luminous red galaxy sample was chosen due to its large volume. This
measurement requires an overlapping spectroscopic and WL survey. They find that

$$E_G \equiv \frac{1}{\beta}\,\frac{\Upsilon_{gm}}{\Upsilon_{gg}} = 0.39 \pm 0.06,$$

where $\Upsilon$ is a filtered correlation function and $\beta$ is the redshift-space distortion
parameter. The combination $E_G$ is equal to $\Omega_m/f$ at the redshift of the lenses, for which GR
predicts $E_G(z = 0.32) = 0.408 \pm 0.029$. This measurement establishes that the peculiar velocities
of galaxies are, to $\sim 15\%$ precision, in agreement with expectations based on the potential
structure traced by lensing.
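As a rough cross-check of the GR value quoted in the last bullet, the sketch below evaluates $E_G = \Omega_{m,0}/f(z)$ for a flat $\Lambda$CDM model. It relies on assumptions that are not in the text: $\Omega_m$ is read as the present-day matter density, the growth rate is approximated by the common fit $f \approx \Omega_m(z)^{0.55}$, and $\Omega_{m,0} = 0.26$ is an illustrative value rather than the one adopted by Reyes et al.

```python
def omega_m_of_z(z, omega_m0):
    """Matter density parameter at redshift z for flat LCDM."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def growth_rate(z, omega_m0, growth_index=0.55):
    """Approximate f = dlnD/dlna as Omega_m(z)**growth_index (assumed fit)."""
    return omega_m_of_z(z, omega_m0) ** growth_index


def e_g_prediction(z, omega_m0):
    """GR expectation E_G = Omega_m0 / f(z) under the assumptions above."""
    return omega_m0 / growth_rate(z, omega_m0)


if __name__ == "__main__":
    # Illustrative inputs: lens redshift from the text, assumed Omega_m0.
    print(e_g_prediction(0.32, 0.26))
```

With these inputs the result falls inside the quoted $0.408 \pm 0.029$ band; different choices of $\Omega_{m,0}$ shift it roughly in proportion.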

All of these measurements will become possible with much smaller error bars once the Stage III
WL experiments are operational. We look forward in particular to much smaller error bars on $b/r$
and $E_G$ derived from the largest scales, as well as improvements on $c(M)$.

5.3.3. Lensing outside the optical bands

All wavelengths of light are gravitationally lensed. The optical is not special in this regard;
rather, the emphasis on optical wavelengths has been technological, as this is the cheapest band in
which to observe and resolve large numbers of galaxies at cosmological distances and obtain some
redshift information. However, advances in technology in other wavebands have resulted in weak
lensing being detected at several other wavelengths:

• In the radio, kilometer-scale interferometers are required to resolve extragalactic sources, and
at the present time one cannot obtain a radio photo-z because of the featureless synchrotron
spectra. However, Chang et al. (2004) detected cosmic shear of extended radio sources using
the Very Large Array FIRST survey.

• Lensing of the CMB has been of interest for some time as it provides the most distant possible
source screen. The first search was carried out in cross-correlation by Hirata et al. (2004a)
using luminous red galaxies in SDSS as the lenses and WMAP temperature anisotropies as the
sources. The signal was detected three years later with combinations of SDSS and NVSS data,
and two additional years of WMAP data (Smith et al., 2007; Hirata et al., 2008). Recently,
the Atacama Cosmology Telescope (ACT) carried out a cosmic shear autocorrelation analysis
using the CMB as a source and detected the signal at $4\sigma$ (Das et al., 2011). While apparently
weak, this measurement shows that $\Omega_\Lambda > 0$ using CMB data alone, without assuming a flat
universe (Sherwin et al., 2011). The ACT and South Pole Telescope (SPT) collaborations are
next planning polarization surveys, which should yield much higher S/N detections of lensing
and provide constraints on the neutrino mass.

By "optical," we mean to include near-infrared wavelengths $\lambda > 0.7\,\mu$m at which stars are still the dominant
source of luminosity, and which are observed through traditional optical telescopes and with detector technology
based on the creation of electron-hole pairs in semiconductors.

5.4. Observational Considerations and Survey Design

5.4.1. Statistical Errors

The forecasting of statistical errors on the cosmological parameters is much more involved for
WL than for supernovae or BAO because of the complex dependence of the observables on the
underlying model. Nevertheless, some intuition can be gained by making approximations to enable
exact evaluation of the integrals. Specifically, we assume (i) a single source redshift $z_s$; (ii) a
power-law matter power spectrum,

$$P_\delta(k, z) = 4.2 \times 10^{-4}\,\sigma_8^2\,H_0^{-1.7}\,G^2(z)\,k^{-1.3}, \tag{89}$$

where the slope $k^{-1.3}$ is chosen to match that of the $\Lambda$CDM power spectrum at scales of tens of
Mpc and the normalization is chosen to give the correct $\sigma_8$; (iii) evaluation of the normalization
$(1 + z)G(z)$ not at the true lens redshift $z_l$ (over which we integrate from 0 to $z_s$) but at a
"typical" lens redshift $z_s/2$; and (iv) a flat universe. Then equation (72) gives

$$C_{EE}(\ell) = 1.1 \times 10^{-4}\,\sigma_8^2\,\Omega_m^2 \left[\left(1 + \frac{z_s}{2}\right) G\!\left(\frac{z_s}{2}\right)\right]^2 \left[H_0 D_c(z_s)\right]^{2.3} \ell^{-1.3}. \tag{90}$$

The variance per logarithmic range in $\ell$ is

$$\Delta^2(\ell) \equiv \frac{\ell^2}{2\pi}\,C_{EE}(\ell) = 1.8 \times 10^{-5}\,\sigma_8^2\,\Omega_m^2 \left[\left(1 + \frac{z_s}{2}\right) G\!\left(\frac{z_s}{2}\right)\right]^2 \left[H_0 D_c(z_s)\right]^{2.3} \ell^{0.7}; \tag{91}$$

this is a measure of the shear variance at a particular angular scale $\theta \sim \ell^{-1}$.
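The power-law approximation above is simple enough to evaluate directly. The following is a minimal sketch, assuming the prefactor $1.1 \times 10^{-4}$ and slope $-1.3$ as read from equation (90); the function names and the example inputs are hypothetical, and the caller is expected to supply $(1 + z_s/2)G(z_s/2)$ and $H_0 D_c(z_s)$ (in units with $c = 1$) from a cosmology code.

```python
import math

C_EE_PREFACTOR = 1.1e-4  # prefactor of equation (90), as read from the text
SLOPE = 1.3              # power-law slope of the matter power spectrum


def c_ee(ell, sigma8, omega_m, growth_term, h0_dc):
    """Power-law estimate of the shear E-mode power spectrum, equation (90).

    growth_term is (1 + z_s/2) * G(z_s/2); h0_dc is H0 * D_c(z_s) with c = 1.
    """
    return (C_EE_PREFACTOR * sigma8 ** 2 * omega_m ** 2 * growth_term ** 2
            * h0_dc ** (1.0 + SLOPE) * ell ** (-SLOPE))


def delta2(ell, sigma8, omega_m, growth_term, h0_dc):
    """Variance per logarithmic interval in ell, equation (91)."""
    return ell ** 2 / (2.0 * math.pi) * c_ee(ell, sigma8, omega_m,
                                             growth_term, h0_dc)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    inputs = dict(sigma8=0.8, omega_m=0.3, growth_term=1.0, h0_dc=0.6)
    for ell in (100, 1000):
        rms_shear = math.sqrt(delta2(ell, **inputs))
        print(ell, c_ee(ell, **inputs), rms_shear)
```

Deriving $\Delta^2$ from $C_{EE}$ rather than hard-coding its prefactor keeps the two quantities consistent by construction; $1.1 \times 10^{-4}/2\pi$ agrees with the quoted $1.8 \times 10^{-5}$ to within rounding.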

In practice, equation (91) is only a rough guide because of deviations of $P_\delta(k)$ from a power
law and the nonlinear enhancement of the matter power spectrum on small scales. Nevertheless,
we can see several important features:

1. The typical shear, given by $\sqrt{\Delta^2(\ell)}$, is of order 1% at cosmological distances
($z_s \sim 1$) and degree scales ($\ell \sim 100$). The shear fluctuations are larger at smaller scales.

2. The shear power spectrum scales as $\propto \sigma_8^2$. Assuming a known background cosmology and
source redshift, a measurement of the power spectrum to $X\%$ determines $\sigma_8$ to an uncertainty
of $\frac{1}{2}X\%$. (In practice, in the nonlinear regime, the dependence of the shear power spectrum
is closer to $\sigma_8^3$ and the constraint on $\sigma_8$ is better than equation (91) would suggest.)

3. Alternatively, if one assumes the standard model for the growth of structure, then the distance
$D_c(z_s)$ to the sources can be determined to an uncertainty of about $X/2.3\,\%$. Lensing thus acts
as a standard "ruler."

4. Measuring the shear power spectrum as a function of source redshift $z_s$ allows one to measure
some combination of the growth function and the distance as functions of redshift. However,
one does not measure both simultaneously. In order to simultaneously constrain the functional
forms $G(z)$ and $D_c(z)$, lensing must be combined with another cosmological probe.

5. Systematic errors in any of the terms in equation (91) will bias the cosmology results. In
particular, a 1% change in $z_s$, e.g. $1.00 \rightarrow 1.01$, changes the power spectrum by 2%. Therefore,
careful estimation of the source redshift distribution is required for a WL survey, a challenge
when relying on photometric redshifts for the vast majority of sources.
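The sensitivity to the source redshift quoted in item 5 can be checked numerically. The sketch below is one way to do so under stated assumptions: a flat $\Lambda$CDM background with an illustrative $\Omega_m = 0.3$, the standard integral expression for the linear growth factor (normalized here to $G(0) = 1$), and the $z_s$-dependent factor $[(1 + z_s/2)G(z_s/2)]^2 [H_0 D_c(z_s)]^{2.3}$ taken from the power-law approximation. All function names are hypothetical.

```python
import numpy as np


def e_of_z(z, omega_m):
    """Dimensionless Hubble rate H(z)/H0 for flat LCDM."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def _trapz(y, x):
    """Plain trapezoidal rule (avoids NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def h0_dc(z, omega_m, n=4001):
    """H0 * D_c(z) with c = 1, i.e. the integral of dz'/E(z')."""
    zz = np.linspace(0.0, z, n)
    return _trapz(1.0 / e_of_z(zz, omega_m), zz)


def growth(z, omega_m, n=20001):
    """Linear growth factor for flat LCDM, normalized to G(0) = 1."""
    def unnormalized(a_max):
        a = np.linspace(1e-4, a_max, n)
        e = np.sqrt(omega_m / a ** 3 + 1.0 - omega_m)
        # D(a) is proportional to E(a) * integral of da' / (a' E(a'))^3
        return e[-1] * _trapz(1.0 / (a * e) ** 3, a)
    return unnormalized(1.0 / (1.0 + z)) / unnormalized(1.0)


def power_amplitude(z_s, omega_m):
    """z_s-dependent factor of the power-law shear power spectrum."""
    growth_term = (1.0 + z_s / 2.0) * growth(z_s / 2.0, omega_m)
    return growth_term ** 2 * h0_dc(z_s, omega_m) ** 2.3


def fractional_change(z_s, dz, omega_m=0.3):
    """Fractional change in shear power when z_s shifts to z_s + dz."""
    return power_amplitude(z_s + dz, omega_m) / power_amplitude(z_s, omega_m) - 1.0


if __name__ == "__main__":
    print(fractional_change(1.0, 0.01))
```

Most of the change comes from the distance factor; the growth term nearly cancels because $(1 + z)$ rises while $G(z)$ falls.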

The statistical uncertainty on the shear power spectrum is determined by two factors: sampling
variance at low $\ell$ and shape noise at high $\ell$. Sampling variance uncertainty is associated with
the fact that there are only a finite number $N$ of Fourier modes in the survey area, and consequently
the fractional uncertainty in the power (variance of $\gamma$) can be no smaller than $\sqrt{2/N}$. If we
measure the power spectrum in a bin of width $\Delta\ell$, then the number of modes is
$N = 2\ell\,\Delta\ell\,f_{\rm sky}$, where $f_{\rm sky}$ is the fraction of the sky observed. This
corresponds to a sampling variance uncertainty

$$\frac{\sigma[C_{EE}(\ell)]}{C_{EE}(\ell)} = \frac{1}{\sqrt{\ell\,\Delta\ell\,f_{\rm sky}}}. \tag{92}$$

If we measure modes up to some $\ell_{\rm max}$, there are $\ell_{\rm max}^2 f_{\rm sky}$ modes, and the
sampling variance uncertainty in the normalization of the power spectrum is
$\sqrt{2}\,f_{\rm sky}^{-1/2}/\ell_{\rm max}$.
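The mode counting in this paragraph translates directly into a few lines of code. This is a minimal sketch with hypothetical function names; it only restates $N = 2\ell\,\Delta\ell\,f_{\rm sky}$ and the $\sqrt{2/N}$ floor and makes no attempt to include shape noise.

```python
import math


def n_modes(ell, delta_ell, f_sky):
    """Number of Fourier modes in a bin of width delta_ell centred on ell."""
    return 2.0 * ell * delta_ell * f_sky


def sampling_error_bin(ell, delta_ell, f_sky):
    """Fractional sampling-variance error on the band power, sqrt(2/N)."""
    return math.sqrt(2.0 / n_modes(ell, delta_ell, f_sky))


def sampling_error_normalization(ell_max, f_sky):
    """Fractional error on the overall normalization using all modes up to ell_max."""
    return math.sqrt(2.0 / (ell_max ** 2 * f_sky))


if __name__ == "__main__":
    # Hypothetical survey: 10% of the sky, bin of width 100 at ell = 500.
    print(sampling_error_bin(500, 100, 0.1))
    print(sampling_error_normalization(1000, 0.1))
```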

At high $\ell$, the errors on the WL power spectrum become dominated not by the number of modes
available but by how well each mode can be measured with a finite number of galaxies. Individual
galaxies are not round, and so a shear estimator applied to a galaxy has an intrinsic scatter
$\sim 0.2$ rms in each component of shear ($\gamma_+$ or $\gamma_\times$). This phenomenon is known as
shape noise. Since it is uncorrelated between distinct galaxies (at least as a first approximation),
shape noise produces a white noise ($\ell$-independent) power spectrum,

$$C_{EE}^{\rm noise}(\ell) = \frac{\sigma_\gamma^2}{n_{\rm eff}}, \tag{93}$$

where $n_{\rm eff}$ is the effective number of galaxies per steradian (this is the true number of
galaxies with a penalty applied for objects where the observational measurement error on the shear
becomes comparable to $\sigma_\gamma$; see below). Since the cosmic shear $C_{EE}(\ell)$ is decreasing
with $\ell$, there is a transition scale $\ell_{\rm tr}$ where the shape noise becomes comparable to the
lensing signal. Using equation (91), we estimate


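$$\ell_{\rm tr} = 1300 \left(\frac{\sigma_8}{0.8}\right)^{1.54} \left(\frac{\Omega_m}{0.3}\right)^{1.54} \left[\left(1 + \frac{z_s}{2}\right) G\!\left(\frac{z_s}{2}\right)\right]^{1.54} \left[\frac{H_0 D_c(z_s)}{0.6}\right]^{1.77} \left(\frac{n_{\rm eff}}{20\ {\rm arcmin}^{-2}}\right)^{0.77} \quad (\sigma_\gamma = 0.2). \tag{94}$$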



At angular scales smaller than $\theta \sim \ell_{\rm tr}^{-1}$, lensing cannot measure the typical
fluctuations in the density field. Statistical measurements are still possible, however, and the
power spectrum can be measured to an accuracy of $\sqrt{2/N}\,C_{EE}^{\rm noise}(\ell)$, where $N$ is
the number of modes. Thus

$$\frac{\sigma[C_{EE}(\ell)]}{C_{EE}(\ell)} = \frac{1}{\sqrt{\ell\,\Delta\ell\,f_{\rm sky}}}\,\frac{C_{EE}^{\rm noise}(\ell)}{C_{EE}(\ell)} = \frac{1}{\sqrt{\ell\,\Delta\ell\,f_{\rm sky}}} \left(\frac{\ell}{\ell_{\rm tr}}\right)^{1.3}. \tag{95}$$

One can see from this equation that the fractional uncertainty on $C_{EE}(\ell)$ in bins of width
$\Delta\ell/\ell \sim 1$ increases with $\ell$ for $\ell > \ell_{\rm tr}$. Therefore we arrive at the
important conclusion that the power spectrum is best measured at the transition scale
$\ell_{\rm tr}$: on larger scales sampling variance degrades the measurement even though individual
structures are seen at high signal-to-noise ratio (SNR), and on smaller scales shape noise dominates.
The aggregate uncertainty in the normalization of the power spectrum is thus of order

$$\frac{\sigma({\rm normalization})}{{\rm normalization}} \sim \frac{1}{\ell_{\rm tr}\sqrt{f_{\rm sky}}}. \tag{96}$$

A full-sky experiment reaching tens of galaxies per arcmin$^2$ at redshifts of order unity would
have $\ell_{\rm tr} \sim 1000$ and so could measure the normalization of the power spectrum to a
statistical precision of order 0.1%. This would be an unprecedented measurement of the strength of
matter clustering. However, as we will see below, there are substantial statistical and systematic
hurdles to such an experiment.

We give the E-mode noise here. There is an equal amount of shape noise power in the B-mode, but the lensing
B-mode is used only as a systematics test because it contains no cosmological signal to first order.

High-amplitude features such as clusters may still be visible.
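The transition scale and the resulting error budget can be sketched by equating the power-law signal of equation (90) to the white shape-noise level $\sigma_\gamma^2/n_{\rm eff}$. The prefactor $1.1 \times 10^{-4}$ and slope $-1.3$ are taken as read from the text; the default inputs ($\sigma_8 = 0.8$, $\Omega_m = 0.3$, $(1 + z_s/2)G(z_s/2) = 1$, $H_0 D_c = 0.6$, $\sigma_\gamma = 0.2$) and all function names are assumptions made for illustration.

```python
import math

ARCMIN2_PER_SR = (180.0 * 60.0 / math.pi) ** 2  # square arcminutes per steradian


def shape_noise_power(n_eff_arcmin2, sigma_gamma=0.2):
    """White shape-noise power sigma_gamma^2 / n_eff, with n_eff per steradian."""
    return sigma_gamma ** 2 / (n_eff_arcmin2 * ARCMIN2_PER_SR)


def ell_transition(n_eff_arcmin2, sigma8=0.8, omega_m=0.3, growth_term=1.0,
                   h0_dc=0.6, sigma_gamma=0.2):
    """Multipole where the power-law shear signal equals the shape noise."""
    amplitude = (1.1e-4 * sigma8 ** 2 * omega_m ** 2 * growth_term ** 2
                 * h0_dc ** 2.3)
    noise = shape_noise_power(n_eff_arcmin2, sigma_gamma)
    return (amplitude / noise) ** (1.0 / 1.3)


def band_power_error(ell, delta_ell, f_sky, ell_tr):
    """Equation (95): fractional error in the shape-noise dominated regime."""
    return (ell / ell_tr) ** 1.3 / math.sqrt(ell * delta_ell * f_sky)


def normalization_error(ell_tr, f_sky):
    """Equation (96): order-of-magnitude error on the overall normalization."""
    return 1.0 / (ell_tr * math.sqrt(f_sky))


if __name__ == "__main__":
    ell_tr = ell_transition(20.0)
    print(ell_tr, normalization_error(ell_tr, 1.0))
```

With the default inputs the transition multipole comes out close to the value of 1300 quoted above, and `normalization_error(1000, 1.0)` reproduces the 0.1% figure for a full-sky survey.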

Finally, we consider galaxies measured at finite SNR. In the above analysis, we assumed that
each galaxy provided an estimate of the shear with uncertainty $\sigma_\gamma$. At finite SNR there is
also measurement noise $\sigma_{\rm obs}$, so that each galaxy provides an estimate with error
$\sqrt{\sigma_\gamma^2 + \sigma_{\rm obs}^2}$. Using inverse-variance weighting, in the finite-SNR case
the shape noise becomes equation (93), with the effective source density

$$n_{\rm eff} = \frac{1}{A} \sum_i \frac{\sigma_\gamma^2}{\sigma_\gamma^2 + \sigma_{{\rm obs},i}^2}, \tag{97}$$

where $A$ is the survey area and the sum is over the galaxies. This is always less than
$n = N_{\rm gal}/A$. The effective source density $n_{\rm eff}$ is limited in part by the depth of the
survey: $\sigma_{{\rm obs},i}$ typically scales with integration time as $\propto t^{-1/2}$, but once
$\sigma_{{\rm obs},i} \ll \sigma_\gamma$ one no longer continues to gain. How long does this take? In
§5.5.3 we will show that for nearly circular, Gaussian galaxies

$$\sigma_{\rm obs} = \frac{1}{\nu} \left[1 + \left(\frac{r_{\rm psf}}{r_{\rm gal}}\right)^2\right], \tag{98}$$

where $r_{\rm psf}$ and $r_{\rm gal}$ are the half-light radii of the PSF and the galaxy, respectively,
and $\nu$ is the detection significance (in $\sigma$s). Thus for galaxies with a similar size as the
PSF, we expect to reach $\sigma_{\rm obs} = 0.1$ (measurement noise half of shape noise) after
integrating long enough to see the galaxy at $20\sigma$.

In principle, the summation in equation (97) is over all objects detected as extended sources, and
any galaxy could be used if its detection significance is high enough. In practice, this is dangerous:
while one might hope to obtain $\sigma_{\rm obs} = 0.1$ on a galaxy with $r_{\rm gal} = 0.5\,r_{\rm psf}$
and a $50\sigma$ detection, the "ellipticity measurement" on this galaxy consists of measuring the small
deviation of the image from the PSF. Such a procedure tends to magnify systematic errors in the PSF
model and is usually inadvisable. Therefore, most WL surveys impose a cutoff on
$r_{\rm gal}/r_{\rm psf}$ or some similar property.
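The two numerical examples in this passage ($\sigma_{\rm obs} = 0.1$ at $20\sigma$ for $r_{\rm gal} = r_{\rm psf}$, and at $50\sigma$ for $r_{\rm gal} = 0.5\,r_{\rm psf}$) pin down the measurement-noise scaling as $\sigma_{\rm obs} = \nu^{-1}[1 + (r_{\rm psf}/r_{\rm gal})^2]$. The sketch below uses that form together with inverse-variance weights to compute an effective source density; the function names and the toy catalog are hypothetical.

```python
def sigma_obs(nu, r_psf_over_r_gal):
    """Shape measurement noise for a nearly circular Gaussian galaxy.

    nu is the detection significance; the form is the one implied by the
    two worked examples in the text.
    """
    return (1.0 / nu) * (1.0 + r_psf_over_r_gal ** 2)


def n_eff(sigma_obs_values, area, sigma_gamma=0.2):
    """Effective source density: sum of weights sigma_g^2/(sigma_g^2 + sigma_obs^2)."""
    weights = (sigma_gamma ** 2 / (sigma_gamma ** 2 + s ** 2)
               for s in sigma_obs_values)
    return sum(weights) / area


if __name__ == "__main__":
    # Toy catalog: (detection significance, r_psf / r_gal) per galaxy.
    catalog = [(20.0, 1.0), (50.0, 2.0), (10.0, 1.0), (100.0, 0.5)]
    noise = [sigma_obs(nu, ratio) for nu, ratio in catalog]
    print(noise)
    print(n_eff(noise, area=1.0))  # always below len(catalog) / area
```

A size cut of the kind described above would simply drop catalog entries whose `r_psf / r_gal` exceeds a chosen threshold before the sum is taken.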

5.4.2. The Galaxy Population for Optical Surveys

The design of a WL survey must begin by considering the population of galaxies. We will focus
here on the population in the 3-dimensional space of redshift $z$, effective radius $r_{\rm eff}$, and
apparent AB magnitude in the $I$-band (a convenient choice for shape measurement with red-sensitive
CCDs from the ground). The plots shown here are based on the mock catalog of Jouvel et al. (2009),
which uses real galaxies from the COSMOS survey but fills in missing information for individual
galaxies (e.g. redshifts or line fluxes) with photo-zs and models.

In practice the Galactic Plane must be avoided, and it is unlikely that optical astronomy would push beyond
$f_{\rm sky} \sim 0.7$ for any cosmological application.

For realistic non-Gaussian profiles, the shape measurement error is usually worse by of order 20%.

Figure 16 shows the mean surface density of galaxies and the median source redshift as a
function of limiting magnitude $I_{AB}$ for effective radius cuts of 0.15", 0.248", and 0.35". In
general, one would like to use galaxies larger than the PSF to avoid amplification of systematics
when applying a PSF correction to the shapes. The "effective radius" (EE50, for 50% encircled energy)
of a typical ground-based PSF is $\sim 0.35$" under good conditions, corresponding to a FWHM of
$\sim 0.7$". The 0.248" cut is a factor of $\sqrt{2}$ smaller, appropriate if one can make use of
galaxies smaller than the PSF ($\sigma_{\rm gal}^2 = \frac{1}{2}\sigma_{\rm psf}^2$) or has sufficient
etendue to do the entire survey under the very best seeing conditions. Measuring galaxies at
$r_{\rm eff} = 0.15$" is well beyond present ground-based cosmic shear survey capabilities, for both
algorithmic and PSF-determination reasons, and will likely require a space (or balloon) based platform.
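The correspondence between a 0.35" EE50 radius and a 0.7" FWHM is exact for a circular Gaussian profile, where the half-light radius is $\sigma\sqrt{2\ln 2}$ and the FWHM is twice that. The short sketch below makes this explicit; the Gaussian profile is an assumption (real seeing-limited PSFs have broader wings), and the function names are hypothetical.

```python
import math


def gaussian_sigma_from_fwhm(fwhm):
    """Gaussian width sigma for a given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_ee50_radius(fwhm):
    """Radius enclosing 50% of the flux of a circular 2-D Gaussian.

    The enclosed fraction is 1 - exp(-r^2 / (2 sigma^2)); setting it to 0.5
    gives r = sigma * sqrt(2 ln 2), which is exactly FWHM / 2.
    """
    return gaussian_sigma_from_fwhm(fwhm) * math.sqrt(2.0 * math.log(2.0))


if __name__ == "__main__":
    ee50 = gaussian_ee50_radius(0.7)   # arcsec, for 0.7" seeing
    print(ee50, ee50 / math.sqrt(2.0))  # PSF-sized cut and the sqrt(2)-smaller cut
```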

5.4.3. Photometric Redshifts and their Calibration

Modern WL analyses all use photometric redshifts in some way. They are central to tomography
and cosmography measurements, and they are also needed in most schemes to remove the intrinsic
alignment contamination. In the case of GGL, photo-zs are used to select sources that are actually
behind the lens plane (sources in front of the lens are unlensed and dilute the signal, whereas
sources at the same redshift as the lens can contribute intrinsic alignments).

One can characterize the photo-z distribution using the joint probability distribution for the
photo-z $z_p$ and the true redshift $z$ for some sample of galaxies, $P(z_p, z)$. In the case of
lensing, we care about the conditional probability distribution, $P(z|z_p)$. This distribution is
sometimes characterized by its conditional bias and scatter,

$$\delta z(z_p) = z_p - \langle z \rangle|_{z_p}, \qquad \sigma_z(z_p) = \sqrt{\langle z^2 \rangle|_{z_p} - \langle z \rangle^2|_{z_p}}, \tag{99}$$

but it is always non-Gaussian and in practice there are "outliers" or "catastrophic failures" with
$|z - z_p| \sim O(1)$. The conditional probability distribution is not symmetric: Bayes's theorem
tells us that

$$P(z|z_p) = \frac{P(z)}{P(z_p)}\,P(z_p|z), \tag{100}$$

so a photo-z that is "unbiased" in the conventional sense of $\langle z_p \rangle|_z = z$ may still
have $\delta z(z_p) \neq 0$.
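A small Monte Carlo makes the last point concrete. The sketch below draws true redshifts from a Gaussian population, scatters them symmetrically to obtain photo-zs that are unbiased at fixed true redshift, and then measures the conditional bias in bins of $z_p$. The population parameters, the scatter, and the function names are all hypothetical choices for illustration; no outliers are included.

```python
import numpy as np


def simulate_photoz(n, mean_z=1.0, width_z=0.3, scatter=0.1, seed=0):
    """Draw true redshifts and photo-zs with <z_p>|z = z by construction."""
    rng = np.random.default_rng(seed)
    z_true = rng.normal(mean_z, width_z, n)
    z_phot = z_true + rng.normal(0.0, scatter, n)
    return z_true, z_phot


def conditional_bias(z_true, z_phot, edges):
    """Mean of (z_p - z) in bins of z_p, i.e. delta z(z_p) averaged per bin."""
    index = np.digitize(z_phot, edges) - 1
    bias = np.full(len(edges) - 1, np.nan)
    for i in range(len(edges) - 1):
        selected = index == i
        if selected.any():
            bias[i] = z_phot[selected].mean() - z_true[selected].mean()
    return bias


if __name__ == "__main__":
    z_true, z_phot = simulate_photoz(200_000)
    edges = np.array([0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6])
    print(np.mean(z_phot - z_true))               # close to zero overall
    print(conditional_bias(z_true, z_phot, edges))  # changes sign across the mean
```

Galaxies with a high $z_p$ are more likely to have scattered up from the populous middle of the distribution than down from the sparse tail, so $\langle z \rangle|_{z_p}$ is pulled toward the population mean.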

If the full distribution $P(z|z_p)$ is known, then the shear cross-power spectra for any pair of
redshift slices can be determined for a given cosmological model. However, the use of photo-zs
to suppress intrinsic alignments (§5.6.1) does not work if the intrinsic alignments of the outliers
are significant, or even if the scatter is large enough that galaxies can evolve significantly within a
redshift bin, so there is a strong motivation to reduce them to the minimum level possible. Thus
lensing programs must face two challenging problems: (i) obtaining a low outlier rate, and (ii)
determining $P(z|z_p)$ to sub-percent precision.

To understand how to reduce the outlier rate, we must investigate how photo-zs work: they take
several broad-band fluxes from a galaxy and try to identify spectral features. At low redshifts, the
strongest feature in the optical part of a galaxy spectrum is the break around 3800-4000 Å, arising
from metal line absorption in early-type galaxies and the Balmer continuum (plus high-order lines)
in late-type galaxies. As the redshift of the galaxy increases, this feature moves to the red, and
above redshifts of $z \sim 1.3$ it is no longer useful for optical photo-zs (depending on the SNR in
$z$ and $y$ bands). At $z > 2$, the Ly$\alpha$ break redshifts into the optical bands and can be used,
but it is possible to confuse it with the Balmer/4000 Å break. This is the principal example of a
photo-z degeneracy.

Figure 16 The mean surface density of galaxies (top panel) and median redshift (bottom panel)
as a function of limiting magnitude $I_{AB}$. The three curves show different cuts: the top curve is a
cut at 0.15", which might be applied to a space-based survey; the middle curve is a cut at 0.248",
which would be an optimistic choice from the ground; and the bottom curve is a cut at 0.35", a
more conservative choice for a ground-based survey with $\sim 0.7$" seeing (FWHM). For galaxy-galaxy
lensing, one could make more aggressive cuts.

Figure 17 The SEDs of three stellar populations are shown: a single burst at age 25 Myr (top); a
continuous star-forming population of 6 Gyr age (middle); and a single burst at 11 Gyr (bottom).
All have solar metallicity. Blueward of Ly$\alpha$ they have been adjusted for an IGM transmission
factor of 0.8 (appropriate for $z = 2.25$; see McDonald et al. (2006)) but other corrections (dust,
nebular emission) are not included. The models are obtained from Bruzual and Charlot (2003). Note the
break at $\sim 0.37$-$0.40\,\mu$m present in all models (albeit with varying shape, strength, and
precise location).

The above discussion suggests that to reduce outliers across the whole range of redshifts used
for WL surveys ($z = 0$ to $\sim 3$), one desires coverage from blueward of the Balmer/4000 Å feature
(i.e. a $u$-band) through the near-IR ($J + H$ bands), so that either the Balmer/4000 Å feature or
Ly$\alpha$ is robustly identifiable. The optical bands can be easily observed from the ground. As one
moves redward, however, the sky brightness as observed from the ground increases rapidly, and obtaining
the $J + H$ band photometry matched to the depth of future surveys is only practical from space.
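The redshift ranges in this discussion follow from where the two breaks land in the observed frame. The sketch below computes the observed wavelength of each feature and the Ly$\alpha$ redshift that places its break at the same observed wavelength as a low-redshift 4000 Å break. The rest wavelengths (4000 Å and 1216 Å) are round-number assumptions, and the function names are hypothetical.

```python
BALMER_BREAK_A = 4000.0  # assumed rest wavelength of the Balmer/4000 A break
LYA_A = 1216.0           # assumed rest wavelength of Lyman-alpha


def observed_wavelength(rest_wavelength, z):
    """Observed-frame wavelength of a feature emitted at rest_wavelength."""
    return rest_wavelength * (1.0 + z)


def lya_redshift_mimicking_break(z_low):
    """Redshift at which the Ly-alpha break falls where the 4000 A break
    of a galaxy at z_low would be observed."""
    return observed_wavelength(BALMER_BREAK_A, z_low) / LYA_A - 1.0


if __name__ == "__main__":
    print(observed_wavelength(BALMER_BREAK_A, 1.3))  # red edge of the optical
    print(observed_wavelength(LYA_A, 2.0))           # enters the blue optical
    print(lya_redshift_mimicking_break(0.2))         # degenerate high-z solution
```

Near-infrared photometry helps because a genuine high-redshift galaxy shows its own Balmer/4000 Å break there, while the low-redshift interloper does not.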

One is then left with the problem of measuring the photo-z error distribution. Conceptually
the simplest way to do this is to collect spectroscopic redshifts of a representative subsample of the
sources used for WL. This is, however, very expensive in terms of telescope time: many galaxies
have weak or absent emission lines (particularly if one restricts to the optical range), and so one
searches for absorption features of faint ($i \sim 22$-25) galaxies. Stage III/IV experiments may
require $O(10^5)$ redshifts, and we desire sub-percent failure rates because the failures are likely
concentrated at specific redshifts. These failure rates are far below those that have actually been
achieved by spectroscopic surveys at the desired magnitudes. An alternative idea is to make use of the
fact that a subset of these galaxies have strong emission lines, which makes obtaining spectroscopic
redshifts much easier. Then one could use the 2-D angular cross-correlation of these spectroscopic
galaxies with the main photo-z sample, as the correlation between photometric galaxies at photo-z
$z_p$ and spectroscopic galaxies at redshift $z$ should be proportional to $P(z|z_p)$ (Newman, 2008).
However, it also depends on the bias of the photo-z galaxies: one can show that the cross-correlation
method actually constrains the combination $b(z, z_p)P(z|z_p)$. Of course, probability distributions
must integrate to unity, $\int P(z|z_p)\,dz = 1$, but if the bias varies with photo-z error (e.g.,
because high-bias red galaxies and low-bias blue galaxies have different photo-z error distributions)
then this must be modeled to extract $P(z|z_p)$ (Matthews and Newman, 2010). The cross-correlation
technique has to date not been used for WL surveys, but it has been used to measure other redshift
distributions; see, e.g., its application to radio galaxies. Overall, the problem of measuring
$P(z|z_p)$ to the required accuracy remains one of the greatest challenges for future WL projects.

5.4.4. Lensing in the Radio

An interesting alternative to shape measurement in the optical is to work in the radio part of
the spectrum, where late-type galaxies are observable via their synchrotron emission. In order to
achieve the required resolution, one needs to use a large interferometer: a fringe spacing of 1" is
achievable at 1 GHz with a baseline of 60 km. One also needs a large collecting area to obtain
high-SNR images on a competitive number of galaxies; the SKA could in principle measure billions
of galaxies (Blake et al., 2004). But let us suppose such an interferometer were built. What would
it do for WL? In principle, it could solve many problems at once:

• Shape measurement: An interferometer directly measures the Fourier transform of the surface
brightness of a galaxy, $I(\mathbf{u})$, thereby avoiding the difficulty of interpolating PSF properties
from stars. On long baselines, the $\mathbf{u}$-plane is usually sparsely sampled, i.e., not all values
of $\mathbf{u}$ are observed; the Fourier mode $\mathbf{u} = \mathbf{L}_\perp/\lambda$, where
$\mathbf{L}_\perp$ is the interferometer baseline projected into the plane of the sky. At a given
wavelength, as the Earth rotates, each baseline thus traces out an ellipse in the $\mathbf{u}$-plane.
However, if one combines a finite range of $\lambda$ and many baselines, one could fill in the
$\mathbf{u}$-plane, and model-fitting shape measurement techniques can work even with significant
coverage gaps. Model-fitting methods were used in the analysis of the FIRST survey at
$\lambda = 20$ cm (Chang et al., 2004), which resulted in a $3\sigma$ detection of cosmic shear.

• Redshifts: Late-type galaxies contain atomic gas, and thus radiate in the H I 21 cm line. For
nearby galaxies ($z < 0.1$) this line has a long history of being used as a redshift indicator.
A Stage IV radio interferometer survey could collect hundreds of millions of spectroscopic
redshifts in this line out to $z \sim 1$-2, thereby obviating the need to rely on photo-zs and
calibrate photo-z error distributions. Conversely, it is not clear that one could make use of
the many radio galaxies not detected in H I, as even photometric redshifts for large radio
galaxy samples are difficult to obtain at high completeness.

• Shape noise reduction: The radio part of the spectrum offers interesting opportunities to
reduce shape noise. For example, if one spatially resolved the H I disk of a galaxy, one
could produce a velocity map. A perfect inclined disk has the long axis aligned with the
velocity gradient, and if sheared this alignment is destroyed, so a measurement of the velocity
gradient provides independent information on the intrinsic shape of the galaxy (Morales, 2006).
Another idea is to use the polarization of the synchrotron emission, which tends to
be perpendicular to the galactic disk and hence is an indicator of the position angle of the
intrinsic minor axis (Brown and Battye, 2010). While promising, these ideas are new, and
their practical application to a WL survey may have to await at least a partial SKA.
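The resolution requirement quoted at the start of this subsection (a 1" fringe spacing at 1 GHz with a 60 km baseline) follows from the diffraction relation $\theta \approx \lambda/L$. A minimal sketch, with hypothetical function names:

```python
import math

C_LIGHT = 299_792_458.0                    # speed of light in m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # arcseconds per radian


def fringe_spacing_arcsec(freq_hz, baseline_m):
    """Angular fringe spacing lambda / L for a projected baseline L."""
    wavelength = C_LIGHT / freq_hz
    return wavelength / baseline_m * ARCSEC_PER_RAD


def baseline_for_resolution(freq_hz, theta_arcsec):
    """Projected baseline needed to reach a given fringe spacing."""
    wavelength = C_LIGHT / freq_hz
    return wavelength / (theta_arcsec / ARCSEC_PER_RAD)


if __name__ == "__main__":
    print(fringe_spacing_arcsec(1.0e9, 60.0e3))    # arcsec
    print(baseline_for_resolution(1.0e9, 1.0))     # metres
    print(baseline_for_resolution(1.4e9, 0.35))    # matching a good optical PSF
```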
5.4.5. Lensing of the CMB

It is also possible to do lensing analyses on the CMB. Here there are several advantages: the
source redshift is known exactly from cosmological parameters, $z_{\rm src} = 1100$; theory predicts
exactly the statistical distribution of hot and cold spots on the CMB, so there is no intrinsic
alignment effect; and the PSF (or "beam" shape) of microwave experiments tends to be far more stable
than in the optical. The CMB is a diffuse field rather than a collection of objects (galaxies), so
reconstructing the shear requires a different mathematical formalism than for galaxy lensing. The
basis for this formalism is two-fold:

• In the presence of lensing by a Fourier mode of the potential $\psi(\mathbf{l})$, the CMB anisotropy
field is no longer statistically isotropic: different temperature Fourier modes become correlated,
$\langle T^*(\mathbf{l}_1)\,T(\mathbf{l}_2) \rangle \propto \psi(\mathbf{l}_2 - \mathbf{l}_1)$. These
products of temperature modes can be used as estimators of the lensing potential (Hu, 2001).



• A more dramatic effect occurs for CMB polarization. The unlensed CMB polarization is pure
E-mode, i.e., the polarization in each Fourier mode is parallel to the wavevector rather
than at a 45° angle. Lensing shear changes the direction of the wavevector $\mathbf{l}$ but not the
polarization, so it can generate B-mode polarization.

Until recently, because of SNR issues, lensing of the CMB had been detected only in
cross-correlation with foreground galaxies (Smith et al., 2007; Hirata et al., 2008). The advent of
the arcminute-scale CMB experiments (primarily motivated by cluster cosmology using the SZ effect)
has enabled robust detections of the power spectrum of the CMB lensing field (Das et al., 2011).

Because CMB lensing only provides a single source slice, it is unlikely to ever replace galaxy
lensing. However, in combination with galaxy lensing, it can provide the most distant source slice
for tomography (Hu, 2002) and cosmography (Acquaviva et al., 2008).



5.5. Measuring Shears 

So far we have treated shear measurement as a black box: it takes in an image of the galaxy and 
some knowledge of the instrument, and it returns $\hat\gamma_{+,\times}$, an unbiased estimator for the true shear 
$\gamma$ with some uncertainty per component $\sigma_\gamma$. This black box is very complicated on the inside, as 
one needs an accurate and robust shape measurement algorithm, and even providing the necessary 
inputs to such an algorithm, particularly an accurate determination of the PSF, has proven to be 
difficult. After a brief overview of these algorithms, we describe the idealized problem of measuring 
shear from an ensemble of galaxy images, then turn to a more detailed discussion of the challenges 
that arise in practice. 

There are two general strategies for shape measurement methods in common use today. One 
class of methods is to measure moments of galaxies (in real or Fourier space), and relate, e.g., 
the mean quadrupole moment of galaxies to the shear. These methods started with ad hoc "PSF 
correction" prescriptions, but they have recently evolved toward methods that attempt to statisti- 
cally close the hierarchy of moments of galaxies and PSFs in a model-independent way. The other 
class of methods is based on forward modeling: one adopts a model for a galaxy (e.g., an elliptical 
Sersic profile, or a linear combination of basis images), simulates the observational procedure, and 
minimizes $\chi^2$. Both approaches have their advantages and disadvantages. Much of the early WL 
work used moments-based methods, but for years a generally applicable PSF correction scheme 
seemed out of reach. Some of the more recent incarnations of the Fourier domain moments-based 
methods work for arbitrary distributions of galaxy and PSF profiles; however, these are less mature 
in their practical implementation, and they impose stringent requirements on input data quality 
(e.g., sampling). The forward modeling methods can handle a much wider range of observational 
defects (e.g., under some circumstances one may even be able to measure a galaxy containing missing 
pixels), but they depend on a model for the galaxy being observed; one must carefully assess 
the impact of an insufficiently general model. Both strategies require exquisite knowledge of the 
PSF. 

Footnote: Primordial gravitational waves can generate a B-mode on large scales, but such gravitational waves are 
adiabatically damped on angular scales below a degree. Thus the ~10'-scale B-mode should be dominated by lensing. 

Currently there are many algorithms in use in each category. The prototype moments-based 
method was that of Kaiser, Squires, and Broadhurst (KSB; Kaiser et al., 1995; improved by Luppino and Kaiser, 
1997; Hoekstra et al., 1998). Many improvements of these methods have been made, e.g., 
computing better conversion factors from shear to quadrupole moment (Semboloni et al., 2005). 
Elliptical-weighted moments and the concept of shear-covariance were introduced by Bernstein and Jarvis 
(2002) and have been used extensively in SDSS (Hirata and Seljak, 2003b). Further progress was 
made by moving to moments in Fourier space, where the PSF "correction" becomes trivial (one 
divides by the Fourier transform of the PSF, at least in the regions where it is nonzero). This 
has culminated in the development of a shape measurement method that is exact in the high-SNR 
limit (Bernstein, 2010). We discuss this method and its development in §5.5.2. An early 
example of the model-fitting approach was IM2SHAPE (Bridle et al., 2002). More recently, Bayesian 
model fits have been introduced that are stable at lower SNR (Miller et al., 2007; Kitching et al., 
2008); these are currently being applied to the CFHTLS. The "shapelet" basis (Refregier, 2003; 
Refregier and Bacon, 2003), derived from energy eigenstates of a 2D quantum harmonic oscillator, 
is useful in both types of methods. The coefficients in a shapelet decomposition are moments, but 
one may also fit a model galaxy parameterized by its shapelet coefficients. 

The various shape measurement algorithms have been tested and compared in blind simulations, 
such as the Shear Testing Program (STEP1/STEP2; Heymans et al., 2006; Massey et al., 2007b) 
and GREAT08 (Bridle et al., 2010). In most of these cases, the objective is to minimize both the 
shear calibration error $m$ (i.e., the error in the response to a given input shear) and the spurious 
shear $c$ (i.e., the shear measured by the algorithm on an unlensed sample of galaxies). The STEP2 
simulations used typical ground-based PSFs and complex galaxy morphologies and found that many 
of the measurement methods had shear calibration errors $|m|$ of one-to-several percent, and spurious 
shear $|c|$ ranging from several $\times 10^{-4}$ to several $\times 10^{-3}$. This level of performance should thus be 
considered typical of the more mature, heavily used shear measurement algorithms, although recent 
methods have done better. On the other hand, the algorithmic errors are only a portion of the 
error budget in a WL experiment; most importantly, the early simulation tests did not require 
participants to recover the spatial variability of the PSF. Such a test is currently ongoing as part of 
the GREAT10 challenge (Kitching et al., 2010). Early results from GREAT10 are now available, 
but their significance is still being digested. 

In the remaining portions of this section we will discuss the mathematical problem of shape 
measurement (§5.5.1), the basis for some of the commonly used methods (§5.5.2), and their 
statistical errors (§5.5.3). We cannot of course do justice to every method that has been suggested 
or used. We have chosen to highlight the recent progress in Fourier-space methods, since in principle 
they provide an exact solution in the limit of high SNR and are thus ripe for further development 
and utilization (Bernstein, 2010). There are some biases that can result even for perfect shape 
measurement (or galaxies measured with a $\delta$-function PSF), including the noise-related biases and 
selection biases, which are probably present at some level for all known algorithms; these are 
discussed in §5.5.4. Finally, §5.5.5 describes the determination of the PSF, which is taken as an 
input for any shape measurement algorithm. 

Footnote: Massey et al. (2007b) §3.1.1 gives an excellent technical review of the methods derived from KSB. 

5.5.1. The Idealized Problem 

The idealized shape measurement problem is as follows: we have a galaxy in the source plane 
whose surface brightness is $f_0(\mathbf{x})$, where $\mathbf{x}$ is a 2-dimensional vector in the plane of the sky. It is 
first sheared, i.e., the galaxy in the image plane is $f(\mathbf{x}) = f_0(\mathbf{S}\mathbf{x})$, where $\mathbf{S}$ is the shearing matrix, 

$$\mathbf{S} = \begin{pmatrix} 1-\gamma_+ & -\gamma_\times \\ -\gamma_\times & 1+\gamma_+ \end{pmatrix}. \qquad (101)$$

(We assume $|\gamma| \ll 1$ here and work to linear order in $\gamma$ for simplicity, although higher-order 
corrections will be important for Stage IV surveys.) We do not observe the actual image on the 
sky, however; we observe it through an instrument with PSF $G(\mathbf{x})$. The resulting image is 

$$I(\mathbf{x}) = [f * G](\mathbf{x}) = \int f(\mathbf{x}')\, G(\mathbf{x} - \mathbf{x}')\, d^2\mathbf{x}' = \int f_0(\mathbf{S}\mathbf{x}')\, G(\mathbf{x} - \mathbf{x}')\, d^2\mathbf{x}'. \qquad (102)$$

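This equation may also be written in Fourier space: if we define 

$$\tilde f(\mathbf{u}) = \int f(\mathbf{x})\, e^{-2\pi i \mathbf{u}\cdot\mathbf{x}}\, d^2\mathbf{x} \;\leftrightarrow\; f(\mathbf{x}) = \int \tilde f(\mathbf{u})\, e^{2\pi i \mathbf{u}\cdot\mathbf{x}}\, d^2\mathbf{u}, \qquad (103)$$

then equation (102) simplifies to 

$$\tilde I(\mathbf{u}) = \tilde G(\mathbf{u})\, \tilde f_0(\mathbf{S}^{-1}\mathbf{u}). \qquad (104)$$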

In practice, the image $I$ is only obtained at discrete values of $\mathbf{x}$, i.e., at the pixel centers spaced 
by separation $\Delta$. If the image is oversampled, i.e., if the Fourier transform of the PSF is zero 
(or negligible) at wavenumbers above some $|\mathbf{u}|_{\max}$ with $|\mathbf{u}|_{\max} < 1/(2\Delta)$, then it can be 
sinc-interpolated to recover the full continuous function, 

$$I(\mathbf{x}) = \sum_{n_1, n_2} I(n_1\Delta, n_2\Delta)\, \mathrm{sinc}\!\left(\frac{x_1 - n_1\Delta}{\Delta}\right) \mathrm{sinc}\!\left(\frac{x_2 - n_2\Delta}{\Delta}\right), \qquad (105)$$

with $\mathrm{sinc}\, t \equiv \sin(\pi t)/(\pi t)$. 
The pixelization thus represents no special difficulty, except that the sinc function has noncompact 
support and must be smoothly truncated. A second implication of oversampling is that integrals 
of the form $\int P(\mathbf{x})\, I_1(\mathbf{x})\, I_2(\mathbf{x})\, d^2\mathbf{x}$, where $P$ is a polynomial in the coordinates and $I_1$ and $I_2$ are 
oversampled functions, can be replaced without error by (infinite) sums over pixels: $\int \rightarrow \Delta^2 \sum$. 
Again, in practice such sums must be truncated. 
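
As a concrete illustration of equation (105), the following one-dimensional sketch sinc-interpolates an oversampled Gaussian profile between pixel centers. The Gaussian width, pixel spacing, truncation window, and function names are assumptions chosen for illustration; the sinc is taken to be sin(πt)/(πt), and the sum is truncated sharply rather than smoothly, which is adequate here only because the samples decay to zero well inside the window.

```python
import math

def sinc(t):
    """Normalized sinc, sin(pi t)/(pi t), with the removable singularity at t = 0 filled in."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)

def sinc_interpolate(samples, n_min, delta, x):
    """Evaluate the 1D analogue of eq. (105) at position x.

    samples[k] holds the image value at pixel center (n_min + k) * delta.
    """
    total = 0.0
    for k, value in enumerate(samples):
        n = n_min + k
        total += value * sinc((x - n * delta) / delta)
    return total

def profile(x, sigma=2.0):
    """Hypothetical oversampled profile: a Gaussian with sigma = 2 pixels."""
    return math.exp(-0.5 * (x / sigma) ** 2)

delta = 1.0                 # pixel spacing
n_min, n_max = -40, 40      # sharp truncation of the (formally infinite) sum
samples = [profile(n * delta) for n in range(n_min, n_max + 1)]

x_test = 0.37               # a position between pixel centers
interpolated = sinc_interpolate(samples, n_min, delta, x_test)
exact = profile(x_test)
print(abs(interpolated - exact))
```

For a profile that is not effectively band-limited below 1/(2Δ), the same code would show a much larger residual, which is the aliasing that the oversampling condition is meant to exclude.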

We will also define a critical wavenumber $u_{\rm crit}$, which is the smallest wavenumber for which 
there is a Fourier mode with $\tilde G(\mathbf{u}) = 0$ with $|\mathbf{u}| = u_{\rm crit}$. Then we have $\tilde G(\mathbf{u}) \neq 0$ for any $|\mathbf{u}| < u_{\rm crit}$. 
This critical wavenumber determines the region within the Fourier plane over which deconvolution 
is possible, and over which measurement of $\tilde f(\mathbf{u})$ is possible. 



Footnote: Here we use the term "PSF" to include not just the image of a point source produced by the telescope optics but 
also pointing jitter and detector effects. For example, if the detector has square pixels, the PSF is that delivered by 
the telescope convolved with a square top-hat function. 

Footnote: It is important to recall that the definition of oversampling required for equation (105) operates in Fourier space. 
The commonly used condition for oversampling that the FWHM should exceed 2 pixels is a good rule of thumb for 
smooth profiles such as a Gaussian, but it is not appropriate for general PSFs. 

A shape measurement algorithm is a functional $\hat\gamma_i[I, G]$, $i \in \{+, \times\}$, that returns a shear 
estimate. When averaged over a population of galaxies with the same shear, such an algorithm will 
yield an expectation value 

$$\langle \hat\gamma_a \rangle = c_a + (\delta_{ab} + m_{ab})\gamma_b + O(\gamma^2). \qquad (106)$$

Here $c_a$ is called the additive shear error and $m_{ab}$ is the multiplicative shear error or shear calibration 
error. An ideal algorithm will have $c_a = m_{ab} = 0$. 
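
In simulation tests, the additive and multiplicative errors of equation (106) are commonly estimated by regressing the recovered shear on the known input shear. The sketch below does this for a single component, treating $m$ and $c$ as scalars (an assumption; equation (106) allows a full matrix $m_{ab}$). The input shears, the "true" bias values, and the function name are hypothetical.

```python
def fit_shear_bias(gamma_true, gamma_measured):
    """Least-squares fit of gamma_measured = (1 + m) * gamma_true + c.

    Returns (m, c) for one shear component.
    """
    n = len(gamma_true)
    mean_t = sum(gamma_true) / n
    mean_g = sum(gamma_measured) / n
    cov = sum((t - mean_t) * (g - mean_g) for t, g in zip(gamma_true, gamma_measured))
    var = sum((t - mean_t) ** 2 for t in gamma_true)
    slope = cov / var
    intercept = mean_g - slope * mean_t
    return slope - 1.0, intercept

# Hypothetical noiseless simulation with m = 0.02 and c = 5e-4.
gamma_in = [-0.03, -0.01, 0.0, 0.01, 0.03]
gamma_out = [(1.0 + 0.02) * g + 5e-4 for g in gamma_in]
m_hat, c_hat = fit_shear_bias(gamma_in, gamma_out)
print(m_hat, c_hat)
```

With noisy shear estimates the same fit applies, but the uncertainty on $m$ then scales inversely with the spread of input shears, which is one reason simulations use a range of input values.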

Many WL surveys take multiple exposures of each field; if they are oversampled, one may use 
equation (105) to reconstruct a continuous function $I(\mathbf{x})$ for each exposure. If the PSFs in each 
exposure differ (which they usually do), then to construct a stacked image, one can either apply 
a convolution kernel to each input image to make the PSFs the same or do a noise-weighted least 
squares fit to each Fourier mode $\tilde f(\mathbf{u})$. If the individual exposures are undersampled (as is likely for 
space-based data) and appropriately dithered, methods are available in both Fourier space (Lauer, 
1999) and real space (Fruchter, 2011; Rowe et al., 2011) to reconstruct a fully-sampled and hence 
continuous image $I(\mathbf{x})$. In either case, the problem is still one of measuring the shear from an 
ensemble of images of different galaxies. The one exception is that model-fitting shape measurement 
techniques can operate either on the combined images or via a direct fit to the raw input images. 
Even in this case, however, with many exposures (as planned for LSST) object detection will have 
to be carried out on the combined image in order to reach the full survey depth. 

One would intuitively expect that shape measurement becomes more difficult when the PSF 
is larger than the intrinsic size of the galaxy being measured. This is indeed the case. While the 
idealized problem of measuring shapes in the presence of a PSF is well-defined for any nonzero 
galaxy size, in practice both statistical and systematic errors blow up when the PSF becomes 
significantly larger than the galaxy. The extent to which the systematic errors in the high-SNR, 
$\sigma_{\rm gal} < \sigma_{\rm psf}$ regime can be addressed will likely determine the constraining power of large-etendue 
ground-based WL programs such as that planned for LSST. 

5.5.2. Shape Measurement Algorithms* 

The most obvious (but flawed) way to construct a shape measurement algorithm is to 
simply use the quadrupole moment tensor of a galaxy: one could compute 

$$Q_{ij}[I] = \int I(\mathbf{x})\,(x_i - \bar x_i)(x_j - \bar x_j)\, d^2\mathbf{x}, \quad \text{(incorrect)} \qquad (107)$$

where $\bar{\mathbf{x}}$ is the centroid and the $[I]$ implies that we compute the quadrupole moment on an observed 
image. It is easily seen from the properties of convolutions that, for moments normalized by the total flux, 
$Q_{ij}[f] = Q_{ij}[I] - Q_{ij}[G]$, i.e., one may obtain the pre-PSF quadrupole moment of a galaxy by subtracting 
the quadrupole moment of the PSF from the observed quadrupole moment. Then one could construct the 
ellipticities of the galaxy, which are 
simply the trace-free components of the quadrupole moment normalized by the trace: 

$$e_+[f] = \frac{Q_{11}[f] - Q_{22}[f]}{Q_{11}[f] + Q_{22}[f]} \quad \text{and} \quad e_\times[f] = \frac{2Q_{12}[f]}{Q_{11}[f] + Q_{22}[f]}. \qquad (108)$$

Since the quadrupole moment of $f$ is simply related to that of $f_0$ via 

$$Q_{ij}[f] = [\mathbf{S}^{-1}]_{ik}\, [\mathbf{S}^{-1}]_{jl}\, Q_{kl}[f_0], \qquad (109)$$

we may derive the transformation law for ellipticities under infinitesimal shear: 

$$e_+[f] = e_+[f_0] + 2\gamma_+ - e_+[f_0]\,(\gamma_+ e_+[f_0] + \gamma_\times e_\times[f_0]) \quad \text{and}$$
$$e_\times[f] = e_\times[f_0] + 2\gamma_\times - e_\times[f_0]\,(\gamma_+ e_+[f_0] + \gamma_\times e_\times[f_0]). \qquad (110)$$

It is then easily seen that the mean ellipticity of a population of galaxies that has an initially 
isotropic distribution of ellipticities, i.e., $P(e_+, e_\times)$ depends only on the magnitude $\sqrt{e_+^2 + e_\times^2}$ and 
not on the direction $\arctan(e_\times/e_+)$, is 

$$\langle e_a \rangle = (2 - e_{\rm rms}^2)\,\gamma_a, \qquad (111)$$

where $e_{\rm rms}^2$ is the mean square ellipticity per component ($+$ or $\times$). Since we work to first order in 
$\gamma$, we may use the mean square ellipticity of the observed sources in equation (111). So the galaxy 
ellipticity divided by $2 - e_{\rm rms}^2$ is a shear estimator satisfying our desired conditions: by comparison 
to equation (106) there is no additive or multiplicative bias. 

Footnote: Much of the HST/COSMOS weak lensing work used the "Drizzle" algorithm (Fruchter and Hook, 2002), which 
in general leads to a slightly different PSF in each pixel. However, this did not represent a limiting systematic for 
the ~2 deg$^2$ observed in COSMOS. 
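
To make the moment arithmetic of equations (107) and (108) concrete, the sketch below measures flux-normalized second moments of a pixelized galaxy and PSF and recovers the pre-PSF ellipticity by subtraction. It assumes elliptical Gaussian profiles (so that the convolved image is again a Gaussian whose covariance is the sum of the two), moments normalized by the total flux, and a noiseless image; all numerical values and function names are illustrative.

```python
import math

def gaussian_image(cov, half_width=12.0, step=0.25):
    """Sample a unit-flux elliptical Gaussian with covariance matrix cov on a square grid."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    n = int(round(2 * half_width / step)) + 1
    coords = [-half_width + k * step for k in range(n)]
    pixels = []
    for x in coords:
        for y in coords:
            chi2 = (d * x * x - 2.0 * b * x * y + a * y * y) / det
            value = math.exp(-0.5 * chi2) / (2.0 * math.pi * math.sqrt(det))
            pixels.append((x, y, value))
    return pixels

def second_moments(pixels):
    """Flux-normalized, centroid-subtracted second moments (Q11, Q12, Q22)."""
    flux = sum(v for _, _, v in pixels)
    xbar = sum(x * v for x, _, v in pixels) / flux
    ybar = sum(y * v for _, y, v in pixels) / flux
    q11 = sum((x - xbar) ** 2 * v for x, _, v in pixels) / flux
    q12 = sum((x - xbar) * (y - ybar) * v for x, y, v in pixels) / flux
    q22 = sum((y - ybar) ** 2 * v for _, y, v in pixels) / flux
    return q11, q12, q22

def ellipticity(q11, q12, q22):
    """Ellipticity components of eq. (108)."""
    trace = q11 + q22
    return (q11 - q22) / trace, 2.0 * q12 / trace

# Hypothetical galaxy and PSF covariances (pixel^2); Gaussians convolve by adding covariances.
cov_gal = ((1.3, 0.2), (0.2, 0.7))
cov_psf = ((1.0, 0.0), (0.0, 1.0))
cov_obs = tuple(tuple(g + p for g, p in zip(rg, rp)) for rg, rp in zip(cov_gal, cov_psf))

q_obs = second_moments(gaussian_image(cov_obs))
q_psf = second_moments(gaussian_image(cov_psf))
q_gal = tuple(o - p for o, p in zip(q_obs, q_psf))  # Q[f] = Q[I] - Q[G]

e_plus, e_cross = ellipticity(*q_gal)
print(e_plus, e_cross)
```

The recovery is clean only because the image is noiseless: the unweighted sums give the outermost pixels the largest weight, so adding pixel noise to the same grid would swamp the signal.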

The problem with this procedure is that the unweighted quadrupole moment, equation (107), 
involves an integral over the entire sky, with a weight that increases $\propto |\mathbf{x} - \bar{\mathbf{x}}|^2$ as one moves away from 
the centroid of the galaxy. Therefore its measurement noise is infinite. It also fails to converge 
if the wings of the PSF decline as $G(\mathbf{x}) \propto |\mathbf{x}|^{-\alpha}$ for $\alpha < 4$, i.e., it fails to converge for all PSFs 
realized in modern optical telescopes. Therefore equation (107) needs modification. 

A conceptually simple approach is to do a model fit to each galaxy. If one fits a model of an 
exponential or de Vaucouleurs profile galaxy with homologous elliptical isophotes, then one can 
obtain the quadrupole moment Qij[f] analytically from the model and hence the ellipticity of the 
galaxy. Modern model-fitting techniques can even fit more general radial profiles, or simultaneously 
fit bulge + disk models. Model fitting is also robust against many types of nastiness that occur in 
real data, such as dead pixels, cosmic rays, or nonlinear detector effects. However, model fitting 
assumes that the galaxy actually obeys the model — and especially at z > 1, the appearance of 
galaxies is not simple and they are not describable by simple analytical functions. At present, 
our best approach to understand what happens when simple model fits are confronted with complex 
galaxies is with simulations. One can even imagine "re-calibrating" these methods using the 
simulations, e.g., by subtracting the simulated $c_i$ from each shear and multiplying by the matrix 
inverse of $\delta_{ij} + m_{ij}$ (see eq. 106); but of course one is then relying on the galaxy population in the 
simulation to closely trace reality. 
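
The "re-calibration" step described above amounts to inverting equation (106) for the true shear. A minimal sketch follows, assuming $c_a$ and $m_{ab}$ have already been estimated from simulations; the numbers and the function name are hypothetical.

```python
def recalibrate(gamma_hat, c, m):
    """Invert eq. (106): gamma = (1 + m)^{-1} (gamma_hat - c) for a 2-component shear.

    gamma_hat, c: (plus, cross) pairs; m: 2x2 nested sequence of multiplicative errors.
    """
    a11, a12 = 1.0 + m[0][0], m[0][1]
    a21, a22 = m[1][0], 1.0 + m[1][1]
    det = a11 * a22 - a12 * a21
    d_plus = gamma_hat[0] - c[0]
    d_cross = gamma_hat[1] - c[1]
    return ((a22 * d_plus - a12 * d_cross) / det,
            (-a21 * d_plus + a11 * d_cross) / det)

# Hypothetical calibration values taken from a simulation.
c_sim = (4e-4, -2e-4)
m_sim = ((0.02, 0.003), (0.001, 0.015))

# Forward-apply eq. (106) to a known shear, then undo it.
gamma_true = (0.01, -0.02)
gamma_biased = tuple(
    c_sim[a] + gamma_true[a] + sum(m_sim[a][b] * gamma_true[b] for b in range(2))
    for a in range(2)
)
gamma_recovered = recalibrate(gamma_biased, c_sim, m_sim)
print(gamma_recovered)
```

The round trip is exact only because the same $c$ and $m$ are used in both directions; in practice any mismatch between the simulated and real galaxy populations propagates directly into the corrected shear.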

One could also attempt to do a regularized deconvolution of the galaxy. The most popular 
such technique is a basis function technique: one writes the galaxy image as $f(\mathbf{x}) = \sum_n b_n \psi_n(\mathbf{x})$, 
where $\{\psi_n\}$ are a finite basis set and $b_n$ are the fit coefficients; this then becomes a model-fitting 
problem. A common choice is the "shapelet" basis, where the $\{\psi_n\}$ are the energy eigenmodes of the 
2-D quantum harmonic oscillator (polynomials times Gaussians); this requires $(N+1)(N+2)/2$ 
eigenfunctions to represent the $0 \ldots N$ energy levels (Refregier, 2003; Refregier and Bacon, 2003). 
This basis is complete in the limit of large $N$, and the Gaussian endows the basis coefficients with 
simple transformation properties under translation and shear. Real galaxies often require very large 
$N$ to be well-represented, however, especially for cuspy profiles. 
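
The shapelet basis can be made concrete in one dimension, where the basis functions are Hermite polynomials times a Gaussian; the 2-D basis is built from products of these. The sketch below (with function names and the integration grid chosen only for illustration, and the scale length set to unity) evaluates them with the standard normalized recurrence, checks orthonormality numerically, and counts the 2-D functions up to order $N$.

```python
import math

def shapelet_1d(n_max, x):
    """Return [phi_0(x), ..., phi_n_max(x)], the 1D harmonic-oscillator eigenfunctions
    (unit scale length), using the normalized three-term recurrence."""
    phi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_max >= 1:
        phi.append(math.sqrt(2.0) * x * phi[0])
    for n in range(1, n_max):
        phi.append(math.sqrt(2.0 / (n + 1)) * x * phi[n]
                   - math.sqrt(n / (n + 1)) * phi[n - 1])
    return phi

def overlap(m, n, half_width=12.0, step=0.01):
    """Riemann-sum approximation to the integral of phi_m * phi_n over x."""
    total = 0.0
    steps = int(round(2 * half_width / step))
    for k in range(steps + 1):
        x = -half_width + k * step
        phi = shapelet_1d(max(m, n), x)
        total += phi[m] * phi[n] * step
    return total

def count_2d_shapelets(n_max):
    """Number of 2D basis functions phi_n1(x) * phi_n2(y) with n1 + n2 <= n_max."""
    return sum(1 for n1 in range(n_max + 1) for n2 in range(n_max + 1 - n1))

norm_33 = overlap(3, 3)       # should be close to 1
cross_24 = overlap(2, 4)      # should be close to 0
n_basis = count_2d_shapelets(4)
print(norm_33, cross_24, n_basis)
```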

A final class of ideas has been to note that any ellipticity formula that is shear-covariant in the 
sense of transforming via equation (110) enables us to use equation (111). For example, suppose 
that we had the galaxy image $f$ before PSF convolution, and did an unweighted least-squares fit, 
in the sense of minimizing 

$$C = \int [f(\mathbf{x}) - f_{\rm model}(\mathbf{x}|\mathbf{p})]^2\, d^2\mathbf{x}. \qquad (112)$$

Here $f_{\rm model}$ is an elliptical Gaussian fit to the image with free amplitude $A$, centroid $\bar x_i$, and second 
moment matrix $Q^{\rm fit}_{ij}$ (6 parameters). Then $Q^{\rm fit}_{ij}$ and the ellipticities constructed from it would be 
shear-covariant, even if the galaxy's true radial profile does not resemble a Gaussian. Early 
work on implementing this idea in the presence of a PSF attempted to determine the second moment 
matrix of the image on the sky from the observed image and the PSF. For example, 
Gaussian galaxies and PSFs satisfy $Q^{\rm fit}_{ij}[f] = Q^{\rm fit}_{ij}[I] - Q^{\rm fit}_{ij}[G]$, and so "non-Gaussianity corrections" 
were introduced (Bernstein and Jarvis, 2002; Hirata and Seljak, 2003b) that yielded shear 
calibration errors of a few percent. But these methods were heuristic, and moreover they suffer 
from a fundamental limitation: $Q^{\rm fit}_{ij}[f]$ depends on very high-wavenumber Fourier modes $\mathbf{u}$ of the 
image, which are not preserved by the PSF, i.e., $\tilde G(\mathbf{u}) = 0$. It is therefore mathematically impossible 
to determine $Q_{ij}[f]$ from the data in a model-independent manner. 

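To understand this point more fully, and illustrate a solution, let us imagine that we are doing 
an unweighted least-squares fit of a parameterized image $f_{\rm model}(\mathbf{p})$, using equation (112). For 
convenience, we will write the parameters as $\mathbf{p} = \{A, \sigma_{\rm gal}, \bar x_1, \bar x_2, e_+, e_\times\}$, where $\sigma_{\rm gal} = (\det \mathbf{Q})^{1/4}$ 
is a characteristic scale length of the galaxy, so that they have simple transformation properties 
under rotations. Written in Fourier space, it becomes 

$$C = \int [\tilde f(\mathbf{u}) - \tilde f_{\rm model}(\mathbf{u}|\mathbf{p})]^2\, d^2\mathbf{u}, \qquad (113)$$

and its minimum is given by the simultaneous solution of the 6 equations 

$$0 = \int [\tilde f(\mathbf{u}) - \tilde f_{\rm model}(\mathbf{u}|\mathbf{p})]\, \frac{\partial \tilde f_{\rm model}(\mathbf{u}|\mathbf{p})}{\partial p_\alpha}\, d^2\mathbf{u}, \qquad (114)$$

where $p_\alpha$ is any of the 6 parameters. The problem occurs because $\partial \tilde f_{\rm model}(\mathbf{u}|\mathbf{p})/\partial p_\alpha$ has support 
at $|\mathbf{u}| > u_{\rm crit}$, where we cannot determine $\tilde f(\mathbf{u})$. 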

A solution to this problem has been proposed by Bernstein (2010), which is in principle exact 
in the low-noise limit and has been applied to simulations (but not yet to actual data). The key 
is to work in the Fourier domain, where the effect of the PSF is simple and the effect of the shear 
is as simple as in real space. We present the solution here in its most general form, and refer the 
reader to Bernstein (2010) for implementation details. The solution is to replace equation (114) 
with 

$$0 = t_\alpha = \int \tilde f(\mathbf{u})\, \tilde W_\alpha(\mathbf{u}|\mathbf{p})\, d^2\mathbf{u}, \qquad (115)$$

where $W_1 \ldots W_6$ are weight functions. These should be envisioned to be qualitatively similar to 
the derivatives in equation (114); but the only rules that we will impose are that: (i) the Fourier 
transforms $\tilde W_\alpha(\mathbf{u}|\mathbf{p})$ have compact support, confined to $|\mathbf{u}| < u_{\rm crit}$; and (ii) they are rotation and 
translation-covariant, e.g., changing the centroid parameter by $\delta\bar{\mathbf{x}}$ simply translates the function 
$W_\alpha(\mathbf{x}) \rightarrow W_\alpha(\mathbf{x} - \delta\bar{\mathbf{x}})$, and there is a similar transformation when rotating the ellipticity 
components. We do not require the $W_\alpha$ to be shear-covariant: indeed, since a large shear can map any 
mode to another mode with $|\mathbf{u}| > u_{\rm crit}$, such a requirement would be inconsistent with rule (i). 
Now we may write equation (115) as 

$$0 = \int \tilde I(\mathbf{u})\, \frac{\tilde W_\alpha(\mathbf{u}|\mathbf{p})}{\tilde G(\mathbf{u})}\, d^2\mathbf{u}. \qquad (116)$$

The combination $\tilde W_\alpha/\tilde G$ is well-defined, and $\tilde I(\mathbf{u})$ is the Fourier transform of the observed image, 
so the parameters $\mathbf{p}$ can be measured from the data. 

Footnote (on the shear covariance of the Gaussian fit): This is easily seen because the measure $d^2\mathbf{x}$ in equation (112) is shear-invariant. 

Footnote: See also Kaiser (2000), which contains many of these ideas but seems to have been promptly forgotten by most 
of the WL community! 

Footnote: Note that $W_{\bar{\mathbf{x}}}$ must transform as a vector under rotations, and $W_e$ as a spin-2 tensor. 

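By rule (ii), we have rotation covariance, so the mean of the ellipticities $\langle \mathbf{e} \rangle$ over an isotropic 
population of galaxies is zero, even if the PSF is anisotropic. Thus there is no additive bias 
(except for selection and noise effects; see warnings below). However, dropping shear covariance 
has come at a price: the ellipticities $(e_+, e_\times)$ no longer transform according to equation (110), and 
the responsivity coefficient $\langle \mathbf{e} \rangle = \mathcal{R}\gamma$ must be determined. Fortunately, we can evaluate the effect 
of an infinitesimal shear on equation (115): if $\mathbf{S} = \mathbf{1} + \delta\mathbf{S}$, then to first order in $\delta\mathbf{S}$, 

$$\delta\tilde f(\mathbf{u}) = -u_i\, [\delta\mathbf{S}]_{ij}\, \frac{\partial \tilde f_0(\mathbf{u})}{\partial u_j}, \qquad (117)$$

and so 

$$0 = \delta t_\alpha = -\int u_i\, [\delta\mathbf{S}]_{ij}\, \frac{\partial \tilde f_0(\mathbf{u})}{\partial u_j}\, \tilde W_\alpha(\mathbf{u}|\mathbf{p})\, d^2\mathbf{u} + \left\{ \int \tilde f_0(\mathbf{u})\, \frac{\partial \tilde W_\alpha(\mathbf{u}|\mathbf{p})}{\partial p_\beta}\, d^2\mathbf{u} \right\} \delta p_\beta. \qquad (118)$$

The integral in braces {} is simply a 6 × 6 matrix, which we denote $E_{\alpha\beta}$. Using integration by 
parts, the tracelessness of $\delta\mathbf{S}$ in the first integral, and the substitution $\tilde f \rightarrow \tilde I/\tilde G$, we then find 

$$\delta p_\beta = -[\mathbf{E}^{-1}]_{\beta\alpha}\, [\delta\mathbf{S}]_{ij} \int_{|\mathbf{u}| < u_{\rm crit}} \frac{\tilde I(\mathbf{u})}{\tilde G(\mathbf{u})}\, u_i\, \frac{\partial \tilde W_\alpha(\mathbf{u}|\mathbf{p})}{\partial u_j}\, d^2\mathbf{u}, \qquad (119)$$

which is well-defined. This equation tells us how the parameters for each galaxy vary under an 
infinitesimal shear; their ensemble average gives $\mathcal{R}$. Note that once shear covariance has been 
dropped, it is only possible to know the responsivity factor $\mathcal{R}$ if one has a sample of real galaxies 
to observe, since one needs the sample of real galaxies to compute the matrix $E_{\alpha\beta}$. 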
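
The step from equation (118) to equation (119) is compact; a sketch of the intermediate algebra follows, assuming (as rule (i) guarantees) that $\tilde W_\alpha$ vanishes for $|\mathbf{u}| \ge u_{\rm crit}$, so that the boundary term in the integration by parts drops out.

```latex
\begin{align*}
E_{\alpha\beta}\,\delta p_\beta
 &= \int u_i\,[\delta\mathbf{S}]_{ij}\,
    \frac{\partial \tilde f_0(\mathbf{u})}{\partial u_j}\,
    \tilde W_\alpha(\mathbf{u}|\mathbf{p})\,d^2\mathbf{u}
    && \text{(rearranging eq. 118)} \\
 &= -[\delta\mathbf{S}]_{ij} \int \tilde f_0(\mathbf{u})\,
    \frac{\partial}{\partial u_j}\!\left[u_i\,\tilde W_\alpha(\mathbf{u}|\mathbf{p})\right] d^2\mathbf{u}
    && \text{(integration by parts)} \\
 &= -[\delta\mathbf{S}]_{ij} \int \tilde f_0(\mathbf{u})
    \left[\delta_{ij}\,\tilde W_\alpha(\mathbf{u}|\mathbf{p})
          + u_i\,\frac{\partial \tilde W_\alpha(\mathbf{u}|\mathbf{p})}{\partial u_j}\right] d^2\mathbf{u} \\
 &= -[\delta\mathbf{S}]_{ij} \int \tilde f_0(\mathbf{u})\,u_i\,
    \frac{\partial \tilde W_\alpha(\mathbf{u}|\mathbf{p})}{\partial u_j}\,d^2\mathbf{u}
    && \text{(since } [\delta\mathbf{S}]_{ii} = 0\text{)}.
\end{align*}
```

Contracting with the inverse of $\mathbf{E}$ and replacing $\tilde f_0$ by $\tilde I/\tilde G$ (valid to the order we are working, since $\tilde f$ and $\tilde f_0$ differ only at first order in the shear) gives equation (119).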



A related approach to solving the shear calibration problem was suggested by Mandelbaum et al. 
(2011). They noted that given a high-resolution image of a galaxy (e.g., a space-based image) with 
PSF $G_1$, it is often possible to construct a lower resolution but sheared image of the same galaxy 
with PSF $G_2$ in a model-independent way. One can thus directly test any shear estimator on the 
sheared images, and extract the shear calibration factor. Conceptually, the criterion for this to work 
is that all of the Fourier modes of the image observable using PSF $G_2$ must be within the band limit 
of $G_1$ with enough "padding" to make sure that the shear (which also shears the Fourier plane!) 
does not bring unobserved high-wavenumber modes not seen with $G_1$ into the region seen by $G_2$. 
Mathematically, the criteria for this to be possible are that there exist two critical wavenumbers $u_c$ 
and $u_d$ such that (i) all the power in the low-resolution PSF is below $u_c$, i.e., $\tilde G_2(\mathbf{u}) = 0$ for $|\mathbf{u}| > u_c$; 
(ii) the high-resolution transfer function $\tilde G_1(\mathbf{u})$ is far from zero, i.e., $1/\tilde G_1(\mathbf{u})$ is well-behaved, at all 
$|\mathbf{u}| < u_d$; and (iii) $u_c < (1 - \gamma)u_d$. Then one can use the Fourier-domain multiplication: 

$$\tilde I_2(\mathbf{u}) = \tilde G_2(\mathbf{u})\, \Gamma(\mathbf{S}^{-1}\mathbf{u})\, \tilde I_1(\mathbf{S}^{-1}\mathbf{u}), \qquad (120)$$

where $\Gamma(\mathbf{u}) = 1/\tilde G_1(\mathbf{u})$ for wave vectors $|\mathbf{u}| < u_d$. As implemented, this method requires a 
higher-resolution image of a fair subsample of galaxies, which is not always available. It may however 
be quite useful in the Stage III ground-based programs, where one might use HST data for the 
"high resolution" image; see Mandelbaum et al. (2011) for a preliminary application of HST data 
to shear calibration in SDSS. 

5.5.3. Shape Measurement Errors* 

The statistical uncertainty in ellipticity estimation depends on the method used and the radial 
profile of the galaxy, as well as the sizes of the galaxy and PSF and the SNR. Rough rules of thumb 
can however be obtained using nearly circular Gaussians. Propagating instrument noise through 
the elliptical Gaussian fitting method, Bernstein and Jarvis (2002) find in the absence of a PSF 

$$\sigma(e_+[f]) = \sigma(e_\times[f]) = \frac{4\sqrt{\pi n}\,\sigma_f}{F} = \frac{2}{\nu}, \qquad (121)$$

where $n$ is the flux noise variance per unit area, $F$ is the galaxy flux, $\sigma_f$ is the $1\sigma$ width of the 
galaxy (note: the effective radius of a Gaussian is $1.177\sigma_f$), and $\nu$ is the detection SNR in an 
optimal filter. In the presence of a circular Gaussian PSF, the ellipticity is diluted by 

$$e[I] = \frac{\sigma_f^2}{\sigma_I^2}\, e[f] = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_G^2}\, e[f], \qquad (122)$$

where $\sigma_G$ is the PSF width and $\sigma_I$ is the width of the PSF-convolved galaxy image. Furthermore, the 
detection SNR is reduced because the galaxy is smeared out into an aperture with more noise, so it 
follows that equation (121) should be modified by replacing $\sigma_f \rightarrow \sigma_I$; and, if we want the uncertainty 
in the pre-PSF galaxy ellipticity, we must divide out the $\sigma_f^2/\sigma_I^2$ factor from equation (122). This 
gives 

$$\sigma(e_+[f]) = \sigma(e_\times[f]) = \frac{4\sqrt{\pi n}\,\sigma_I^3}{F\sigma_f^2} = \frac{4\sqrt{\pi n}\,(\sigma_f^2 + \sigma_G^2)^{3/2}}{F\sigma_f^2}. \qquad (123)$$

This provides a large advantage for making the PSF smaller than the galaxy: since the noise 
variance $n$ scales with observing time as $t^{-1}$, the time required to measure the shape of a galaxy 
scales as 

$$t \propto \sigma_I^6 \propto \left(1 + \frac{\sigma_G^2}{\sigma_f^2}\right)^3; \qquad (124)$$

in the limit of a poorly resolved galaxy ($\sigma_f \ll \sigma_G$) a factor of 2 improvement in the PSF provides 
a factor of 64 gain in speed. However, as the PSF becomes smaller than the galaxy this advantage 
saturates. 
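
The scalings in equations (121) to (124) are easy to evaluate numerically. The sketch below, with hypothetical function names and arbitrary units, computes the pre-PSF ellipticity uncertainty for circular Gaussians and the relative observing time of equation (124). The prefactor $4\sqrt{\pi n}$ follows from taking the optimal-filter SNR of a Gaussian to be $\nu = F/(2\sigma\sqrt{\pi n})$, which is an assumption of this sketch.

```python
import math

def ellipticity_sigma(flux, noise_var, sigma_gal, sigma_psf=0.0):
    """Per-component uncertainty on the pre-PSF ellipticity for circular Gaussians:
    4 sqrt(pi n) sigma_I^3 / (F sigma_f^2), with sigma_I^2 = sigma_f^2 + sigma_G^2."""
    sigma_obs = math.sqrt(sigma_gal ** 2 + sigma_psf ** 2)
    return 4.0 * math.sqrt(math.pi * noise_var) * sigma_obs ** 3 / (flux * sigma_gal ** 2)

def relative_time(sigma_gal, sigma_psf):
    """Observing time needed for a fixed ellipticity error, up to a constant: eq. (124)."""
    return (1.0 + (sigma_psf / sigma_gal) ** 2) ** 3

# Poorly resolved galaxy: halving the PSF width gains nearly a factor of 2**6 = 64.
gain_unresolved = relative_time(1.0, 20.0) / relative_time(1.0, 10.0)
# Well-resolved galaxy: the same improvement barely helps.
gain_resolved = relative_time(1.0, 0.2) / relative_time(1.0, 0.1)
print(gain_unresolved, gain_resolved)
```

The two ratios bracket the behavior described above: the gain approaches 64 only when the PSF dominates the observed size, and it tends to unity once the galaxy is well resolved.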

Equation (122) also illustrates another property of shape measurement: systematic errors as 
well as statistical errors are inflated by having large PSFs. For example, if there is a systematic 
error in the ellipticity of the observed image $I$, it propagates to the estimated pre-PSF ellipticity 
$e[f]$ with a multiplying factor of $(\sigma_f^2 + \sigma_G^2)/\sigma_f^2$. Therefore there is a systematics advantage to having 
$\sigma_G \lesssim \sigma_f$. 

The shear uncertainty is a factor of ~2 smaller than the ellipticity uncertainty owing to the 
responsivity factor $2 - e_{\rm rms}^2$ (eq. 111). It does however have a minimum value: the ellipticity of an 
individual galaxy has an RMS variation of $e_{\rm rms} \sim 0.4$ per component, so there is a limiting "shape 
noise" contribution to the shear measurement uncertainty of $\approx 0.2$ (although there have been 
some ideas for how to circumvent this). 



5.5.4. Noise Rectification and Selection Biases* 

Two pernicious biases can arise even for the "exact" shape measurement algorithms described 
above: the noise rectification and selection biases. 

Noise rectification bias arises whenever a nonlinear transformation, such as ellipticity measurement, 
is applied to noisy data. If we Taylor-expand the mean of the ellipticity measured on the 
true image $I_{\rm obs}$ around the noiseless image $I$, we find 

$$\langle e[I_{\rm obs}] \rangle = e[I] + \frac{1}{2} \sum_{ab} \frac{\partial^2 e}{\partial I(\mathbf{x}_a)\, \partial I(\mathbf{x}_b)}\, {\rm Cov}[I_{\rm obs}(\mathbf{x}_a), I_{\rm obs}(\mathbf{x}_b)] + \ldots, \qquad (125)$$

where the sum is over pairs of pixels in the image, and $\mathbf{x}_a$ and $\mathbf{x}_b$ are the positions of those pixels. The 
bias is proportional to the noise variance, i.e., to $(S/N)^{-2}$ at leading order. 

One might at first think that the pixel covariance is described by uncorrelated white noise, which 
is statistically shear-invariant and thus leads to no bias; but in the presence of a PSF correction 
[i.e., dividing by $\tilde G(\mathbf{u})$] this is no longer the case. The noise rectification bias was first recognized 
in the context of WL by Kaiser (2000), who showed that because the centroiding of a galaxy 
is more accurate on the "short" than the "long" axis of the PSF, there is a preference for the 
measured second moment of the galaxy to be elongated along the PSF, even if the PSF correction 
method is perfect in the deterministic case. This was generalized by Bernstein and Jarvis (2002) 
to incorporate other noise-related biases and by Hirata and Seljak (2004) to include the effect on 
shear calibration errors. Equation (125) provides a unified framework for computing all of these 
biases to order $(S/N)^{-2}$. At low S/N higher-order terms in the expansion may become important, 
and the expansion itself may break down, e.g., as fitting algorithms jump to alternate minima. 
It is our judgment that it is best to stay away from this "nonperturbative noise" regime. 
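
A toy Monte Carlo shows the mechanism behind equation (125). Here the "ellipticity" is the ratio $e = a/b$ of two noisy quantities (standing in for $Q_{11} - Q_{22}$ and $Q_{11} + Q_{22}$), each with independent Gaussian noise; this two-number model, the noise level, and the random seed are assumptions for illustration, not a model of any real shape estimator. For this ratio the second-order term of the Taylor expansion is $a\,\sigma_b^2/b^3$.

```python
import random

def mean_ratio(a0, b0, sigma, n_draws, seed=12345):
    """Monte Carlo mean of (a0 + noise) / (b0 + noise) with independent Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        a = a0 + rng.gauss(0.0, sigma)
        b = b0 + rng.gauss(0.0, sigma)
        total += a / b
    return total / n_draws

a0, b0, sigma = 0.3, 1.0, 0.1
measured_bias = mean_ratio(a0, b0, sigma, n_draws=400_000) - a0 / b0
# Second-order prediction in the spirit of eq. (125):
# (1/2) * d^2(a/b)/db^2 * Var(b) = a0 * sigma^2 / b0^3; the cross term vanishes
# because the two noise terms are independent.
predicted_bias = a0 * sigma ** 2 / b0 ** 3
print(measured_bias, predicted_bias)
```

Doubling `sigma` in this toy roughly quadruples the bias, which is the $(S/N)^{-2}$ scaling; at much larger noise the denominator can approach zero and the expansion ceases to be meaningful.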

Selection biases are well-known in astronomy. In our case, they will affect the shear if there is 
a bias in favor of detecting galaxies in some orientations rather than others, producing an additive 
shear error, or if selection depends on the magnitude of the ellipticity, which leads to a multiplicative 
shear error because galaxies are preferentially selected when their intrinsic ellipticity is aligned with 
the shear (Hirata and Seljak, 2003b). A similar bias results if galaxies are weighted by various 
properties (e.g., ellipticity uncertainty) that are not shear-invariant. The formalism of §5.5.2 can 
in principle handle this problem if instead of computing $\langle e \rangle$ we compute $\langle we \rangle$ where the weight 
$w = 0$ for galaxies that are rejected. However, the assessment of selection biases in practice has 
been addressed through simulations such as the STEP program. 

A problem related to selection biases is blending: the superposition of images of two galaxies. 
If the galaxies are at the same redshift, they are affected by the same shear, and an ideal shape 
measurement algorithm that measures the blend should recover the "correct" answer — indeed, 
existing WL surveys must contain many sources that are actually blended with their own satellite 
galaxies. But if the deblending algorithm is not shear-invariant there can be a bias in the shear. 
Another issue, particularly for ground-based Stage IV experiments that will aim for high source 
densities at modest resolution and very small statistical errors, is accidental blending of galaxies at 
different redshifts (and hence different shears). 

5.5.5. Determining the PSF and Instrument Properties 

Shape measurement algorithms are only as useful as their inputs: in this case a map of the 
PSF G(x) at each point in the field. Determining the PSF to sub-percent accuracy is one of the 
major challenges in WL. Errors in the PSF model introduce correlated structure into the ellipticity 
field of the galaxies, since residual anisotropy in the PSF determination is interpreted as shear by 
a shape measurement algorithm. 

Fortunately, Nature has provided us with stars, which under typical observing conditions can 
be treated as point sources. Unfortunately, there is only a finite density of stars in high Galactic 
latitude fields, typically of order 1 per arcmin$^2$, so one must interpolate the PSF to the position 
of a galaxy. This is a demanding challenge; any error in the interpolated PSF is likely to have 
spatial structure. It is also not an easy problem, as the PSF is an entire function $G(\mathbf{x}; \boldsymbol\theta)$ at 
every 2-d position $\boldsymbol\theta$ on the sky, and in contrast to shape measurement, interpolation from stars is 
underconstrained. To date, most of the methods applied to real data are heuristic. For example, 
the SDSS analyses fit a low-order polynomial, 

$$G(\mathbf{x}; \boldsymbol\theta) = \sum_{i=0}^{N} \sum_{j=0}^{N-i} \sum_{k=1}^{M} a_{ijk}\, \theta_1^i\, \theta_2^j\, G^{(k)}(\mathbf{x}), \qquad (126)$$

where the $\{G^{(k)}\}$ are the top $M = 3$ principal components of the stellar images, $N = 2$ is the 
interpolation order, and $a_{ijk}$ are coefficients. Small scale structure in the PSF may not be well 
represented by this approach unless $N$ is large, but if the required number of polynomial coefficients 
$(N+1)(N+2)/2$ exceeds the number of stars in each frame then the method falls apart. If the 
small-scale structure is repeatable, for example if it is associated with low-order aberrations in the 
telescope or the topography of the focal plane, then one may make progress by applying PCA to 
the angular dependence in instrument-fixed coordinates (Jarvis and Jain, 2004), choosing the top 
$K$ modes out of the space of $(N+1)(N+2)/2$ polynomials. Recent work has focused on improved 
interpolation schemes that outperform polynomials (e.g., Bergé et al., 2011). 

For space-based data, one can either build a physical model of the PSF (Rhodes et al.) or 
use PCA (Jee et al., 2007). However, for ground-based data where the PSF has a large contribution 
from atmospheric turbulence, the more empirical interpolation schemes have been the methods of 
choice. 

Once one has the PSF, one needs a method of quality assessment. We need to be able to 
determine, or at least bound, the power spectrum of the residual PSF systematics that leak into 
cosmic shear results. (For GGL, this job is easier because residual PSF anisotropy adds noise but 
does not correlate with the positions of the galaxies.) One way is to do null tests: one can compute 
the correlation function of ellipticities of the stars and (supposedly) PSF-corrected galaxies, or 
search for B-mode shear. The latter is not foolproof, as a PSF systematic of E-mode type can 
arise from some aberrations. A very attractive (but underutilized) test is to mask some of the stars 
in the PSF fitting and compare the interpolated PSFs at their locations to the observed stellar 
images. There are also methods for using combinations of these correlation functions to test for 
"overfitting" - the phenomenon in which a too-general PSF model begins to fit noise or small-scale 
structure in the stellar images, with the effect that the interpolated PSF is actually worse (Rowe, 
2010). 

Even when this is done, there remain two other errors that have received increasing attention 
recently, which may cause the PSF of a galaxy to differ from that of a star: 

• Color dependence: Real PSFs depend on the wavelength of light: a diffraction-limited telescope 
has a PSF size that scales $\propto \lambda$, seeing through a Kolmogorov atmosphere gives a size 
$\propto \lambda^{-1/5}$, aberrations introduce $\lambda$-dependence into not just the size of the PSF but its 
morphology and radial profile, real detectors have response functions that depend on wavelength, 
and in ground-based data atmospheric dispersion acts like a prism and causes a centroid shift 
with wavelength. Since galaxies have different SEDs than the stars used to fit the PSF (the 
galaxies are usually redder), the PSF measured from the stars is not always appropriate to 
the galaxies (Cypriano et al.). Moreover, each galaxy contains a range of SEDs due to 
differing stellar ages and metallicities and dust columns, and for each of these SEDs there is 
a different PSF, with the resulting images superposed on the focal plane (Voigt et al., 2011). 
This represents a major challenge: while the centroid wavelength of galaxies in a typical 
filter ($\lambda/\Delta\lambda \sim 5$) varies by several percent, Stage IV surveys will require sub-percent shear 
calibration accuracy, and as yet no WL survey has published a color correction to its PSF at 
all. While this area requires much more work, in a problem of this complexity prevention is 
the first step to a cure. For example, one can employ an atmospheric dispersion compensator 
on the ground, and one can use narrower filters. For smooth spectra or spectra averaged 
over moderate redshift ranges, the variation of the wavelength centroid scales as $\propto \Delta\lambda^2$. The 
advantage of narrower filters must of course be weighed against the slower survey speed with 
smaller $\Delta\lambda$. If many observations of a given field are taken under varying conditions, as 
planned for LSST, one has the intriguing possibility of using the different seeing, focus, or 
hour angle dependences of these errors to solve them out and distinguish them from shear. 

Footnote: As a reminder, here $\mathbf{x}$ is used to refer to the location within the image of each star, i.e., of order 
1", whereas the independent variable $\boldsymbol{\theta}$ accounts for variation across the entire field, of order $\sim 1^\circ$. 

• Detector effects: A CCD image can be altered during readout due to finite charge-transfer 
inefficiency. This results from photoelectrons that temporarily bind to defects ("traps") in 
the material, causing each sky object to leave a trail along the readout direction. If not 
corrected, this can appear as a PSF anisotropy, and it is greater for faint objects than for 
bright objects because some of the traps saturate, so stellar images tend to underestimate 
the effect. Charge transfer inefficiency is primarily a concern for space-based data, since the 
principal cause of traps is radiation damage. Remedies applied to HST data have included 
empirical corrections to the galaxy ellipticities as a function of row and magnitude, or pixel-level 
corrections that begin by "undoing" the charge transfer inefficiency before subsequent 
processing (Massey et al., 2010; Rhodes et al., 2010). 



• Near-infrared detectors (as planned for WFIRST) suffer from different potential detector- 
induced systematic errors than CCDs. In particular, since the image is read "in place" on 
the detector rather than transferred out as on a CCD, there is no charge transfer inefficiency. 
Instead the major concerns have been interpixel capacitance (the sensitivity of the voltage on 
each pixel to the charge collected in neighboring pixels); persistence (a charge trapping and 
release phenomenon in which each exposure contains a small amount of residual signal from 
previous exposures); and reciprocity failure (a detector exposed to twice the signal for half 
the time does not produce the same response, which represents an issue for comparing the 
images of stars and galaxies of very different magnitudes). 
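
The wavelength scalings quoted in the color-dependence item above can be turned into a rough estimate of how much the PSF size changes between a star and a galaxy with different effective wavelengths. The sketch below is only an illustration: the 3% shift in effective wavelength is a hypothetical value chosen to represent "several percent", and the function name `psf_size_ratio` is introduced here.

```python
def psf_size_ratio(lambda_gal, lambda_star, exponent):
    """Ratio of PSF sizes at two effective wavelengths for size ∝ lambda**exponent."""
    return (lambda_gal / lambda_star) ** exponent

DIFFRACTION = 1.0    # diffraction-limited telescope: size ∝ lambda
KOLMOGOROV = -0.2    # Kolmogorov seeing: size ∝ lambda**(-1/5)

# Hypothetical case: galaxy effective wavelength 3% redder than that of the stars.
shift = 1.03
diffraction_change = psf_size_ratio(shift, 1.0, DIFFRACTION) - 1.0   # +3%
seeing_change = psf_size_ratio(shift, 1.0, KOLMOGOROV) - 1.0         # about -0.6%
```

Under these assumptions the diffraction-limited case is roughly five times more sensitive to the SED mismatch than the seeing-limited case, and with the opposite sign.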

5.6. Astrophysical systematics 

The principal advantage of weak lensing is that — despite its technical difficulty — it is directly 
sensitive to mass. It is thus less affected by astrophysical uncertainties than other probes of cosmic 
structure such as the galaxy power spectrum or cluster counts. However, it is not entirely free 
of astrophysical contamination. The two major sources of uncertainties in this case are intrinsic 
galaxy alignments, which can mimic the coherent distortion of galaxies by gravitational lensing, 
and the prediction of the matter power spectrum. 

5.6.1. Intrinsic Alignments* 

We have thus far assumed that the intrinsic ellipticities of galaxies are independent, adding 
noise but not spurious signal to cosmic shear measurements. However, the orientations of galaxies 
are determined by physical processes — mergers, torquing by tidal fields from the host halo and 
large scale structure, etc. — that could produce correlated intrinsic alignments. We first describe 
here the general formalism for the impact of intrinsic alignments, then consider what observations 
and theory have taught us about them. We conclude by discussing prospects for intrinsic alignment 
removal. 

The field of intrinsic galaxy ellipticities is a tensor function $e(\mathbf{r}, \hat{\mathbf{n}})$ of position $\mathbf{r}$ and viewing 
direction $\hat{\mathbf{n}}$. In this sense it is very similar to CMB polarization. In principle it also depends on 
the type of galaxy under consideration and on the observational details; for example, the B 
and I-band images of a galaxy could have different ellipticities. We may also discuss either the 
unweighted intrinsic ellipticity field $e_{\rm unwt}$ or the field weighted by the galaxies, 

$$e_{\rm wt}(\mathbf{r}, \hat{\mathbf{n}}) = [1 + g(\mathbf{r})]\, e_{\rm unwt}(\mathbf{r}, \hat{\mathbf{n}}), \qquad (127)$$

where $g = (n_{\rm gal} - \bar{n})/\bar{n}$ is the galaxy overdensity. In what follows, we use $e$ to denote the 
galaxy-weighted field $e_{\rm wt}$, since this is most closely related to what one observes in a survey. 

Like any other field, e can be Fourier transformed to give e(k, n), with a power spectrum 

$$\langle e_a^*(\mathbf{k}, \hat{\mathbf{n}})\, e_b(\mathbf{k}', \hat{\mathbf{n}}') \rangle = (2\pi)^3 \delta^{(3)}(\mathbf{k} - \mathbf{k}')\, P_{e;ab}(\mathbf{k}, \hat{\mathbf{n}}, \hat{\mathbf{n}}'), \qquad (128)$$

where $a$, $b$ are spin-2 tensor indices. Here we break from the train of reasoning in CMB polarization 
studies: instead of doing a multipole decomposition of $e$, we note that in the Limber approximation 
(which we use exclusively here) there is only one relevant viewing direction, the direction to the 
observer, so $\hat{\mathbf{n}} = \hat{\mathbf{n}}'$. Moreover, the Fourier wave vectors that we care about are perpendicular to 
the line of sight, so $\mathbf{k} \cdot \hat{\mathbf{n}} = 0$. We will thus write this particular configuration as simply $P_{e;ab}(k)$. An 
E/B mode decomposition is also possible if we rotate the coordinate basis so that the E-component 
of ellipticity is aligned along the direction of $\mathbf{k}$ and the B-component is at a 45° angle; we then 
have two ellipticity power spectra, $P_e^{EE}(k)$ and $P_e^{BB}(k)$ (the EB-term vanishes by parity). One 
can also write correlations of the ellipticity with scalar fields such as the galaxy or matter density. 
In this case, only the E-mode can be correlated, and we write $P_{e\delta}$, $P_{eg}$, etc. 

The measured shear on the sky is a superposition of the WL shear and the intrinsic ellipticity 
(converted to shear using the algorithm-specific responsivity factor $\mathcal{R}$): 

$$\gamma_{\rm obs}(\boldsymbol{\theta}) = \gamma(\boldsymbol{\theta}) + \frac{1}{\mathcal{R}} \int p(D_C)\, e(\boldsymbol{\theta}, D_C)\, dD_C. \qquad (129)$$

Limber's equation can then be used to obtain the observed shear power spectrum between the $\alpha$ 
and $\beta$ redshift slices. The E-mode contains three terms: 

$$C^{EE}_{\alpha\beta}(l; {\rm obs}) = C^{EE}_{\alpha\beta}(l; GG) + C^{EE}_{\alpha\beta}(l; GI) + C^{EE}_{\alpha\beta}(l; II), \qquad (130)$$

where the GG term is the gravitational lensing shear contribution, II is the intrinsic ellipticity 
contribution, and GI is the cross-correlation. The GG term is the desired signal and is given by 
equation (80). The other terms are 



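$$C^{EE}_{\alpha\beta}(l; II) = \frac{1}{\mathcal{R}^2} \int p_\alpha(D_C)\, p_\beta(D_C)\, \frac{P_e^{EE}(k = l/D_A)}{D_A^2}\, dD_C \qquad (131)$$

and 

$$C^{EE}_{\alpha\beta}(l; GI) = \frac{1}{\mathcal{R}} \int \left[ W_{{\rm eff},\alpha}(D_C)\, p_\beta(D_C) + W_{{\rm eff},\beta}(D_C)\, p_\alpha(D_C) \right] \frac{P_{e\delta}(k = l/D_A)}{D_A^2}\, dD_C. \qquad (132)$$

There is also an II contribution to the B-mode power spectrum similar to equation (131). Since 
there is no B-mode gravitational shear, there is no GG or GI contribution to the B-mode power 
spectrum. 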

Several generic features can be noted from these equations: 

• The II contribution to the cross-spectrum is only nonzero if the two redshift distributions 
overlap, since it arises from intrinsically aligned galaxies at the same redshift. Therefore, 
if low-scatter photo-zs are available, it can be eliminated. This is one motivation for doing 
tomography instead of simply measuring the shear power spectrum on a magnitude-limited 
sample of galaxies. 

• The GI contribution is more troublesome. It arises from the lensing of the more distant slice 
by the same matter field that controls the intrinsic ellipticity of the nearby slice. Inspection 
of the properties of the window function $W_{\rm eff}$ shows that equation (132) is nonzero for all 
redshift distributions, unless either $P_{e\delta}(k) = 0$ or the redshift distributions are $\delta$-functions at 
the same redshift, which would dramatically enhance II. Thus we are led to conclude that 
every tomographic power spectrum can suffer intrinsic alignment contributions. 

• B-mode "shear" can be generated by intrinsic alignments, and if nonzero $C^{BB}(l)$ is observed 
this is one possible explanation (PSF model errors are another). However, it is possible for 
the intrinsic alignment contribution to the E-mode to be much larger because (i) there is no 
theoretical reason from galaxy formation to expect $P_e^{EE}(k) \approx P_e^{BB}(k)$ (indeed, we will see 
below that $P_e^{EE}(k) \gg P_e^{BB}(k)$ may be natural) and (ii) $C^{EE}(l)$ can also contain a GI term, 
which for broad redshift distributions usually dominates over II. Therefore a nondetection 
of B-mode shear does not rule out significant intrinsic alignment contamination. 

Before we discuss removal of intrinsic alignments, it is helpful to consider the physics underlying 
their power spectra. One can distinguish two cases: early-type galaxies, which are triaxial and 
whose intrinsic ellipticity is presumably related to the direction of the most recent merger or 
the direction of anisotropic collapse (depending on one's idea of how these galaxies are formed), 
and late-type galaxies, whose ellipticity is determined by the disk angular momentum (perhaps 
acquired via tidal torquing during collapse, reshuffled by disk-halo interactions, and perturbed by 
minor mergers). The detailed physics of these processes remains elusive, but some predictions can 
still be made by traditional galaxy biasing arguments. For example, if one considers the formation 
of early-type galaxies in a particular region of the universe, one could argue that at linear order in 
the large-scale density field a galaxy's formation sequence can be sensitive only to the density and 
tidal field coming from the linear modes, and to small-scale structure. Since only the tidal field has 
the correct symmetry properties to be related to an ellipticity, it follows that the ellipticity should 
be proportional to the tidal field, 

$$(e_+, e_\times) = C_1 \left( \partial_1^2 - \partial_2^2,\ 2\partial_1\partial_2 \right) \nabla^{-2} \delta, \qquad (133)$$

where $C_1$ controls the strength of alignment and $\partial_1$ and $\partial_2$ denote derivatives along two orthogonal 
axes on the sky. This implies that the ellipticity traces the density field, and in particular 

$$P_e^{EE}(k) = C_1^2 P_\delta(k), \quad P_e^{BB}(k) = 0, \quad {\rm and} \quad P_{e\delta}(k) = C_1 P_\delta(k). \qquad (134)$$

Equations (133, 134) are known as the linear alignment model (Catelan et al., 2001). Note that 
they predict only E-mode intrinsic alignments, because the alignments are linearly sourced by a 
scalar field. 

Footnote: If one interprets the model of equation (133) as applying to the unweighted ellipticity field, then converting to a 
galaxy-weighted field introduces a B-mode. However it is much smaller than the E-mode signal and vanishes in the 
linear regime. 

Observations of LRGs in the SDSS have shown that the galaxy-ellipticity correlation 
has the same power-law slope as the galaxy correlation function, $w_g(r_p) \propto r_p^{-0.7}$ (Mandelbaum et al., 
2006; Hirata et al., 2007), with an amplitude that increases rapidly with LRG luminosity. This is 
a quantitative success of the linear model. However, on small scales it is not clear how accurate 
equation (134) should be. 

For late-type galaxies, it is less clear what to expect. The oldest and most widely discussed 
model is that disk galaxies acquired their angular momentum from tidal fields acting on nonspherical 
protogalaxies, an effect that would make the resulting intrinsic ellipticity quadratic in the tidal field: 
this is known as the quadratic alignment model (Pen et al., 2000). This model produces both E 
and B-mode II signals, but to leading order it predicts $P_{e\delta}(k) = 0$ and hence gives no GI signal 
(Hirata and Seljak, 2004). However, one should be cautious about this argument for several reasons, 
most importantly because there has not yet been any quantitative observational confirmation of 
the scale and configuration dependence predicted by the quadratic model, and additionally because 
perturbation theory arguments show that the nonlinear evolution of the tidal field can generate a 
linear-type alignment (Hui and Zhang, 2008). What is clear from observations is that the alignments 
of late-type galaxies on large scales, at least as measured by $w_{ge}(r_p)$, are consistent with zero and 
are certainly much less than for LRGs (Hirata et al., 2007; Mandelbaum et al., 2011). 



Detailed assessments of the intrinsic alignment contamination have been made on the basis of 
SDSS, 2SLAQ, and WiggleZ observations of $w_{ge}(r_p)$ (Hirata et al., 2007; Mandelbaum et al., 2011). 
These studies show that for surveys of modest depth ($z_{\rm med} \sim 0.7$) the GI contamination may be 
up to several percent for late-type galaxies if it is near current upper limits, and it could be tens 
of percent for LRGs. As one probes to higher source redshifts the level of contamination becomes 
increasingly uncertain, because there are not yet galaxy surveys at $z > 1$ that are capable of probing 
intrinsic alignments at interesting levels. The II contamination for broad redshift distributions is 
found to be much less than GI for linear alignment models. 

Finally, we consider the methods used to remove intrinsic alignments. One starts with prevention: 
in the recent COSMOS analysis, Schrabback et al. (2010) suppressed II by throwing out the 
auto-power spectra of each of their redshift slices with itself, keeping only the cross-spectra. They 
also suppressed GI by not including LRGs in the foreground redshift slice, since LRGs contribute 
the most to the contamination. However, it is not clear that sample selection alone will provide 
sufficient GI rejection for Stage III surveys and beyond. Two general approaches to GI rejection 
have been proposed, model-independent and model-dependent. 

The model-independent GI rejection method is to note that if we have narrow redshift bins, 
and denote the foreground and background slices by $z_\alpha$ and $z_\beta$ respectively, then the GI signal 
depends only on intrinsic alignments at $z_\alpha$ (i.e., in the nearer bin). Then at fixed $z_\alpha$ the GI signal 
is proportional to 

$$C^{EE}_{\alpha\beta}(l; GI) \propto \cot_K D_C(z_\alpha) - \cot_K D_C(z_\beta), \qquad (135)$$

which becomes small if $z_\beta - z_\alpha$ is small. This is a different redshift dependence than the GG 
signal, which is a linear function of $\cot_K D_C(z_\alpha)$ but remains finite as $z_\beta \to z_\alpha$. Hence it could 
be projected out (Hirata and Seljak, 2004); e.g., one could take the $\alpha\beta$ shear cross-spectrum 
at several background bins and extrapolate to $z_\alpha$. An alternative implementation of this idea 
is nulling, constructing a synthetic redshift slice by weighting of the different $z_\beta$ whose window 
function satisfies 

$$W_{\rm eff,syn}[D_C(z_\alpha)] = \sum_\beta w_\beta\, W[D_C(z_\alpha), D_C(z_\beta)] = 0. \qquad (136)$$

Clearly some of the weights $w_\beta$ must be negative. This class of techniques assumes nothing about 
the physics of intrinsic alignments, but because of the extrapolations or negative weights it can 
amplify observational systematics, and to date it has not been successfully implemented on real 
data. 

Footnote: The galaxy-ellipticity correlation has been measured as the line-of-sight integral of the correlation function, $w(r_p)$, where $r_p$ is the transverse 
separation, which contains the same information as the power spectrum. 
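
A minimal numerical sketch of the nulling condition in equation (136) may help make the need for negative weights concrete. Several assumptions are made here that are not in the text: a flat universe (so that $\cot_K D = 1/D$), a GI kernel taken directly from the proportionality in equation (135), an added normalization $\sum_\beta w_\beta = 1$, and hypothetical comoving distances. The function names are illustrative.

```python
import numpy as np

def gi_kernel(d_alpha, d_beta):
    """GI redshift dependence of eq. (135), assuming a flat universe (cot_K D = 1/D)."""
    return 1.0 / d_alpha - 1.0 / np.asarray(d_beta, dtype=float)

def nulling_weights(d_alpha, d_beta):
    """Minimum-norm weights with sum(w * kernel) = 0 and sum(w) = 1."""
    kernel = gi_kernel(d_alpha, d_beta)
    scaled = kernel / np.abs(kernel).max()       # rescale for numerical conditioning
    constraints = np.vstack([scaled, np.ones_like(scaled)])
    target = np.array([0.0, 1.0])                # null the GI term, normalize the weights
    weights, *_ = np.linalg.lstsq(constraints, target, rcond=None)
    return weights

# Hypothetical comoving distances (Mpc) of one foreground and four background bins.
d_alpha = 1500.0
d_beta = np.array([2000.0, 2500.0, 3000.0, 3500.0])
w = nulling_weights(d_alpha, d_beta)
residual = float(np.sum(w * gi_kernel(d_alpha, d_beta)))   # vanishes to rounding error
```

Because every background bin lies behind the foreground bin, all kernel values are positive, so the weighted sum can only vanish if some weights are negative; in this toy setup the most distant bins receive the negative weights.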

A model-dependent alternative, less demanding in terms of observational systematics, is to construct 
the 3×3 symmetric matrix of power spectra of the matter, galaxies, and intrinsic ellipticity, 

$$\mathbf{P}(k) = \begin{pmatrix} P_\delta & P_{g\delta} & P_{e\delta} \\ P_{g\delta} & P_g & P_{eg} \\ P_{e\delta} & P_{eg} & P_e^{EE} \end{pmatrix}(k). \qquad (137)$$

This has six free functions of wavenumber, of which one ($P_\delta$) can be predicted from cosmological 
parameters. However, since the tidal field is determined by the matter distribution, if galaxy 
alignments are really determined by the tidal field then they should not additionally care where the 
other galaxies are: the conditional probability distribution satisfies Prob$(e|\delta, g) =$ Prob$(e|\delta)$. In this case, 
and in the limit of a Gaussian field, one should have the restriction 

$$P_{e\delta}(k) = \frac{P_\delta(k)}{P_{g\delta}(k)}\, P_{eg}(k). \qquad (138)$$

This relation was assumed by the DETF in their WL parameter forecasts (Albrecht et al., 2006), 
and if valid it is very useful because it relates the GI contamination ($P_{e\delta}$) to theory ($P_\delta$), GGL 
($P_{g\delta}$), and galaxy-ellipticity correlations at the same redshift ($P_{eg}$). Unfortunately, its accuracy 
is unclear in the nonlinear regime, since for non-Gaussian density fields, Prob$(e|\delta, g) =$ Prob$(e|\delta)$ 
no longer implies equation (138); an investigation of this in simulations should be a high priority. 
Nevertheless, equation (138) may be usable if the GI correlation for late-type galaxies turns out to 
be far below current upper bounds, in which case even a crude correction could reduce it to below 
statistical error bars. 
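
As a consistency check on equation (138), the short sketch below evaluates the relation for a toy model in which galaxies are linearly biased with no stochasticity and the ellipticity follows the linear alignment model. The bias $b$, the alignment strength $C_1$, and the power-law form of $P_\delta$ are hypothetical values chosen only for illustration, and `predict_p_edelta` is a name introduced here.

```python
import numpy as np

def predict_p_edelta(p_delta, p_gdelta, p_eg):
    """Eq. (138): P_edelta(k) = P_delta(k) / P_gdelta(k) * P_eg(k), per k bin."""
    return np.asarray(p_delta) / np.asarray(p_gdelta) * np.asarray(p_eg)

# Toy inputs (hypothetical): power-law matter spectrum, linear bias b, alignment C1.
k = np.logspace(-2, 0, 5)
p_delta = 1.0e4 * k**-1.5
b, c1 = 2.0, 0.05
p_gdelta = b * p_delta          # g = b * delta
p_eg = b * c1 * p_delta         # E-mode ellipticity = C1 * delta
p_edelta = predict_p_edelta(p_delta, p_gdelta, p_eg)
```

In this limit the relation reduces to $P_{e\delta} = P_{eg}/b = C_1 P_\delta$, which agrees with the linear alignment prediction of equation (134).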

Intrinsic alignments also represent a contaminant to GGL if the "lens" and "source" redshift 
distributions overlap; some of the "sources" may then be physically associated with the 
lens and show an alignment that is a result of galaxy formation physics rather than lensing 
(Bernstein and Norberg, 2002; Hirata et al., 2004b). However, in this case the availability of good 
photo-zs solves the problem, since for GGL there are only II alignments, which can be eliminated 
by restricting cross-correlations to non-overlapping redshift slices. 

5.6.2. Theoretical uncertainties in the matter power spectrum* 

An important systematic error in weak lensing is the prediction of the cosmic shear power 
spectrum, which, although far more theoretically tractable than galaxy clustering, is not free 
of uncertainty. WL gets most of its information from the nonlinear regime, where the only way to 
accurately predict the power spectrum is using large N-body simulations. At the present time, most 
WL constraints have used physically motivated fitting formulae calibrated to N-body simulations 
(e.g., Peacock and Dodds, 1996; Smith et al., 2003), but these have limited accuracy because of the 
limited resolution and box size of the simulations and the limited ranges of cosmological parameters 
that have been explored. The situation has improved dramatically in recent years thanks to Moore's 
law and the fact that the "interesting" region of parameter space has shrunk considerably. Much 
improved nonlinear matter power spectrum calculations have been obtained from the "Coyote 
Universe" simulations (Heitmann et al., 2009, 2010; Lawrence et al., 2010). Given this progress, 
and given that the N-body problem is perfectly well-defined mathematically, we expect that the 
theoretical uncertainty in the power spectrum for pure dark matter models will not be a limiting 
systematic for WL. 

Footnote: Equation (138) is a slightly more general relation than assuming that the galaxies are linearly biased with no stochasticity, 
in which case one could replace $P_\delta(k)/P_{g\delta}(k) \to 1/b$. 

The situation is more complicated when one goes beyond pure dark matter. Baryons make 
up ~ 17% of the matter in the universe, and on small scales they do not trace the dark matter. 
Hydrodynamic simulations can follow them, but one cannot hope to model the processes of cooling, 
star formation, metal enrichment and feedback from first principles. On quasilinear scales, 
$k \sim$ few $\times\, 0.1\, h$ Mpc$^{-1}$, the largest uncertainty appears to come from clusters, where redistribution 
of baryons in radius affects the 1-halo contribution to the power spectrum. Observations 
of clusters, in particular measurement of cluster concentrations, may help to constrain this 
effect (Rudd et al., 2008). It has been proposed to either "self-calibrate" the cluster profile effect 
(Zentner et al., 2008) or incorporate information from cluster-galaxy lensing (Mandelbaum et al., 
2008), although this has not yet been necessary for present cosmic shear experiments. 

A second uncertainty is associated with the missing baryon problem: the fact that most of 
the baryons that should be in galaxy-sized haloes (assuming a cosmic baryon:CDM ratio) are 
not observed in the stellar, H I, and molecular gas components. If these baryons have been 
ejected from the host halo, e.g. via galactic winds or AGN feedback, then they could reduce 
the matter power spectrum. These effects were discussed in an idealized "nightmare scenario" 
by Levine and Gnedin (2006); more recently, detailed hydrodynamic simulations have been carried 
out, and van Daalen et al. (2011) find a suppression of the matter power spectrum by 1% at 
$k = 0.3\, h$/Mpc and 10% at $k = 1\, h$/Mpc. However, it seems that the suppression can be captured 
by a parameterized halo model for the baryons (Semboloni et al., 2011). 



A final issue is the accuracy of the leading-order mapping from $P_\delta(k, z)$ to the shear power 
spectrum, equation (72). Next-order perturbation theory arguments (Krause and Hirata) 
suggest that the correction is small, only a few $\sigma$ for Stage IV experiments. Ultimately, this 
correction should be computed with ray-tracing simulations that solve the full deflection equation. 



5.7. Systematic Errors and their Amelioration 

Summarizing results from our earlier discussion, the principal systematic errors in weak lensing 
measurements are: 



• PSF correction and shape measurement biases (§§5.5.2-5.5.5): For the typical case of a PSF 
of a similar size to the galaxy, the correction of the galaxy ellipticity for PSF effects is of 
order unity, and the desired accuracy is $< 10^{-3}$. This requires both very accurate knowledge 
of the PSF and appropriate algorithms to correct for it. 

• Redshift distribution uncertainties (§5.4.3): Using source galaxies at a higher redshift increases 
the WL power spectrum, and hence there is a degeneracy between $z_{\rm source}$ and the 
inferred cosmological parameters. If the source redshift distribution (or distributions, in the 
case of a tomographic analysis using photometric redshifts) is not well calibrated, there is 
a resulting error in the inferred cosmology. 

• Intrinsic alignments (§5.6.1): The ellipticities of nearby galaxies may be correlated with each 
other due to formation in a common environment. This "II effect" adds to the observed shear 
power spectrum and represents a systematic error. There can also be cross-correlation of the 
intrinsic ellipticity of a nearby galaxy at $z = z_1$ with the lensing signal on a more distant 
galaxy at $z = z_2 > z_1$, since both depend on the tidal field at $z = z_1$. This latter "GI effect" 
contaminates all tomographic cross-power spectra and hence is more difficult to remove than 
II. 

• Matter power spectrum uncertainties (§5.6.2): Predicting $P_m(k, z)$ from a set of cosmological 
parameters is a nontrivial task. This can now be done accurately for dark matter only 
cosmologies (assuming that the dark matter behaves as simple CDM), but on small scales the 
influence of baryonic physics (cooling, feedback, etc.) is difficult to model. 

These errors, and the steps to remove them, are not independent; for example, marginalizing 
out the intrinsic alignment effects can amplify systematic errors in photometric redshifts 
(Bridle and King, 2007). The development of systematic error budgets and requirements for future 
surveys thus requires a global analysis of all of the statistical and systematic uncertainties and their 
possible degeneracies (Bernstein, 2009). 



We have described numerous strategies for suppressing most of these effects, but a few features 
stand out. First, exquisite knowledge of the PSF must be achieved through some combination of 
good engineering (designing a stable telescope and instrument and putting it in the best possible 
environment), good choice of observing strategy (more dithers and repeat visits), and good algo- 
rithms (one needs to generate a homogeneous catalog with well-understood ellipticity errors and 
selection effects). Second, precise photo-zs over the entire range covered by the survey are desirable 
for characterizing the redshift distribution, and they are required if one is to even attempt a model-independent 
removal of intrinsic alignments, something that has not yet successfully been done. 
To achieve these photo-z requirements, one wants optical and near-IR photometry to distinguish 
Balmer/4000 Å breaks from Lyα breaks and spectroscopic samples that span the full range of the 
WL samples. Cross-correlation against large redshift surveys can be an important tool in photo-z 
calibration (Newman, 2008). 



5.8. Advantages of a Space Mission 

A space platform offers two critical advantages for weak lensing: (i) the availability of a small 
and stable PSF, and (ii) the low sky brightness in the near-IR, which allows deeper observations. 
For this reason, weak lensing has been highlighted as an important science objective for the Euclid 
and WFIRST space missions. 

The small PSF enables the telescope to resolve many more galaxies (see the corresponding figure). The 
space-based PSF size is normally determined by the diffraction limit: for an ideal Airy disk with an 
unobstructed circular aperture (off-axis telescope), the 50% encircled energy radius EE50 is $0.535\lambda/D$. 
This worsens to $1.25\lambda/D$ for an obstructed aperture with secondary mirror and baffle that block 
50% of the radius of the system. Nevertheless, for typical $\lambda$ (of order 0.8 μm for a visible mission 
and 1.5 μm for a near-IR mission) and reasonable telescope size ($D$ = 1.1-1.5 m) the EE50 radius 
is several times smaller than the typical ~ 0.3-0.4 arcsec from a good ground-based site. There 
are additional contributions to the PSF size - charge diffusion, the pixel tophat, aberrations, and 
pointing jitter - but on a space weak lensing mission these would be designed to be subdominant 
to diffraction. 
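
The diffraction-limited numbers quoted above are easy to evaluate. The following sketch converts EE50 = $0.535\lambda/D$ to arcseconds; the aperture $D = 1.2$ m is a hypothetical value chosen from within the quoted 1.1-1.5 m range, and the function name `ee50_arcsec` is introduced here for illustration.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def ee50_arcsec(wavelength_um, diameter_m, coefficient=0.535):
    """50% encircled energy radius in arcsec, EE50 = coefficient * lambda / D."""
    return coefficient * wavelength_um * 1.0e-6 / diameter_m * ARCSEC_PER_RAD

# Unobstructed aperture, hypothetical D = 1.2 m.
visible = ee50_arcsec(0.8, 1.2)    # about 0.074 arcsec
near_ir = ee50_arcsec(1.5, 1.2)    # about 0.138 arcsec
```

The `coefficient` argument can be set to 1.25 to represent the obstructed aperture described in the text.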

A perhaps more important advantage is the stability of the PSF on a space mission, which allows 
for better characterization. The dominant contribution to a ground-based PSF is from atmospheric 
turbulence, which varies rapidly as a function of time and field position. This is eliminated in space. 
Moreover, contributions to the optical distortions from temperature variations and gravity loading 
can be reduced or eliminated, particularly at the L2 Lagrange point (the preferred location for 
a space mission). The three dominant contributions to PSF ellipticity on a space mission are (i) 
astigmatism, which causes the ellipticity of the PSF to vary with focus position; (ii) coma from 
misaligned optics, which at second order leads to ellipticity; and (iii) anisotropic pointing jitter. 
Of these, (i) and (ii) are functions of mirror positions, whose time and field position dependence 
are controlled by a small number of parameters. The pointing jitter is the least stable - it may 
be different in every exposure — but it has a controlled position dependence, no color dependence 
(at least with all-reflective optics), and can be monitored with the same fine guidance sensors 
used to point the telescope. Therefore a space mission offers the possibility of a PSF whose entire 
structure is determined by a small number of parameters that can be tracked as a function of time 
(Ma et al., 2008). This gives the best prospect of accurate PSF knowledge 
at every point in every exposure. The diffraction PSF has the unfortunate feature of having a 
size that is highly color-dependent ($\propto \lambda/D$), and in the presence of aberrations the ellipticity is 
color-dependent as well. However, in contrast to ground-based observations, the color dependence 
is controlled by the same wavefront error that determines the PSF morphology. 

As already noted, optimal photo-z performance across the entire relevant range of redshifts 
can be obtained only with continuous coverage from blueward of the 4000 Å break (at z = 0) 
through the near-IR. In particular the Balmer/4000 Å feature is always detected except at very 
high redshift (z > 3), which reduces the number of objects with no breaks identified and provides 
cleaner separation of the Lyα versus Balmer/4000 Å breaks. Collecting photometric data points in 
the bluer bands (starting at the ~ 3200 Å atmospheric cutoff) is quite reasonable from the ground, 
and in this area there is no major advantage to a space mission. However, as we move to the red the 
space mission begins to look much more attractive. From the ground, the near-IR sky brightness 
(relevant for broadband imaging) is dominated by the decay of OH radicals, which are produced 
in vibrationally excited states at ~ 90 km altitude in the Earth's upper atmosphere (Leinert et al., 
1998). The typical sky brightness rises from 18.5 mag AB arcsec$^{-2}$ in the Z band through 15.4 
mag AB arcsec$^{-2}$ in the H band. In space in the 1-2 μm region the dominant background is 
instead scattering of sunlight off of interplanetary dust particles (the "zodiacal light"). The typical 
brightness is ~ 23 mag AB arcsec$^{-2}$ near the ecliptic poles and 21.5 mag AB arcsec$^{-2}$ in the ecliptic 
plane (Leinert et al., 1998). Thus in the H band the sky brightness is a factor of 300-1000 lower in 
space, which means that a space telescope with even ~ 1 m$^2$ collecting area would outperform the 
best ground-based telescopes in terms of near-IR imaging survey speed. Note also that because of 
the altitude of the OH emitting layer, airplane or balloon based platforms cannot access the low 
background available in space. 
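
The quoted factor of 300-1000 follows directly from the magnitude differences between the ground-based and space-based H-band sky. A short sketch of the arithmetic, using only the surface brightnesses given above (the function name `flux_ratio` is introduced here):

```python
def flux_ratio(mag_bright, mag_faint):
    """Surface-brightness ratio implied by a difference in magnitudes."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))

ground_h = 15.4                              # mag AB arcsec^-2, H band from the ground
ratio_plane = flux_ratio(ground_h, 21.5)     # ecliptic plane: about 275
ratio_pole = flux_ratio(ground_h, 23.0)      # ecliptic poles: about 1100
```

The two values bracket the range stated in the text once rounded.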

5.9. Prospects 

The next several years promise to be very exciting for weak lensing as we enter the Stage III
era. Two major wide-field ground-based imagers are coming online in the 2012/13 timeframe: the
Dark Energy Camera (DECam) at CTIO in the Southern Hemisphere, and the Hyper Suprime
Cam (HSC) on Subaru in the Northern Hemisphere (Miyazaki et al., 2006). These will provide
great leaps in etendue, roughly 35 m^2 deg^2 for DECam and 70 m^2 deg^2 for HSC (versus 8 m^2
deg^2 for CFHT/MegaCam). The Dark Energy Survey (DES; using DECam) plans to observe 5000
deg^2 in the grizy bands over five years to ~24th magnitude (10σ r band AB, shallower in y). The
HSC plans a somewhat deeper and narrower survey, also in grizy (2000 deg^2, 25th magnitude 10σ).



See the WFCAM website, http://casu.ast.cam.ac.uk/surveys-projects/wfcam, and beware of Vega to AB
conversions, which are significant in the near-IR.
http://www.darkenergysurvey.org/

These projects together will measure the shapes of roughly 300 million galaxies and provide accurate
photometric redshifts out to z ~ 1.3; this represents a 1½ order of magnitude increase relative to
current data sets. We expect that the use of several revisits and shape measurements in multiple
bands, as well as incorporating the lessons from Stage II WL projects such as the CFHTLS and
SDSS, will provide additional control over systematic errors in shape measurement. With careful
attention to the source redshift distribution as well, and the photo-z capability provided by y-band
imaging, the Stage III cosmic shear projects (DES and HSC) should reach the 1% level of precision
on the amplitude σ_8, as well as providing high-S/N measurements of its increase as a function of
cosmic time. If the stochasticity issue turns out to be tractable, a similar level of precision will be
reached by using galaxy-galaxy lensing to constrain the bias of galaxies and infer σ_8 indirectly from
galaxy clustering.

The Stage III projects will also mark the completion of the research program of extrapolating
the amplitude measured from the CMB forward in time and comparing it to the value of σ_8
measured via WL, and using the agreement of the amplitudes to measure w(z) or test GR. There
is a fundamental limitation to this type of comparison coming from reionization: while Planck will
measure the CMB power spectrum to very high accuracy, one needs the optical depth τ to convert
this into a normalization of the initial perturbations. The optical depth seems unlikely to be measured
from the CMB E-mode to significantly better than 0.01 due to cosmic variance, foregrounds, and modeling
uncertainties (Holder et al., 2003; Mortonson and Hu, 2008; Colombo and Pierpaoli, 2009).
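To see why an uncertainty of 0.01 in the optical depth matters at the percent level, recall that the CMB power spectrum on small angular scales constrains the combination A_s exp(-2τ) rather than the primordial amplitude A_s alone. At fixed observed amplitude the inferred A_s scales as exp(2τ), so the inferred fluctuation amplitude (proportional to the square root of A_s) scales as exp(τ). The sketch below assumes only this standard scaling; the function name is illustrative.

```python
import math


def sigma8_shift_from_tau(delta_tau: float) -> float:
    """Fractional change in the inferred fluctuation amplitude for a shift
    delta_tau in optical depth, holding the observed A_s * exp(-2 tau) fixed."""
    # A_s scales as exp(2 tau) at fixed observed amplitude,
    # so sigma_8 ~ sqrt(A_s) scales as exp(tau).
    return math.exp(delta_tau) - 1.0


frac = sigma8_shift_from_tau(0.01)
print(f"fractional amplitude shift for delta_tau = 0.01: {100 * frac:.2f}%")
```

An optical depth uncertainty of 0.01 therefore maps onto roughly a 1% uncertainty in the extrapolated amplitude, comparable to the 1% precision targeted by the Stage III cosmic shear projects.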

The completion of the DES and HSC will not, however, mark the end of the road for cosmic
shear. Because of the reionization degeneracy, the next step will be to make highly accurate
measurements of the shape of the signal (dependence on scale and redshift) rather than its amplitude.
This is a scientific matter of critical importance: if DES/HSC find a convincing deviation from the
expected amplitude of low-redshift structure, one does not know whether this reflects a breakdown
of GR at late times (a phenomenon that might be linked to cosmic acceleration) or something that
happened to alter the growth of structure between z ~ 10^3 and z ~ a few (such as massive neutrinos,
though early dark energy would also be a possible explanation). What is needed next is a survey
that measures the rate at which the growth of structure is suppressed internally to the low-redshift
data. In our §8 forecasts we describe deviations from the GR-predicted growth rate using the
parameter Δγ (see eqs. [44] and [45]), though other choices are possible. Even the Stage III surveys may
make only preliminary measurements in this direction. Albrecht et al. (2009) estimated that DES
could measure Δγ to a 1σ accuracy of only 0.2 using the evolution of the WL signal, and our fiducial
Stage III forecast in §8.3 yields σ_Δγ = 0.148 (see Table [9]). Clusters calibrated by stacked weak
lensing might enable a significantly tighter constraint (§8.4, Fig. [44]), and redshift-space distortions
could also enable good measurements of Δγ (§7.2). It is not clear, however, that any method will
achieve percent-level measurements of the rate of low redshift structure growth in the 2010 decade.
Reaching this goal is one of the major drivers for Stage IV projects using WL and other probes
of structure growth. It requires highly accurate, low-systematics shape measurements of galaxies
across a wide range of redshifts, including z > 1 where the angular radii of galaxies are small and
the shape measurement challenges are immense.

Fortunately, the Stage IV WL experiments are already being planned, although their first 
light is not expected until > 2020. There are several approaches. One is the Large Synoptic 
Survey Telescope (LSST), which would feature a giant-etendue (290 m^2 deg^2) telescope dedicated
to optical surveys of the Southern Hemisphere. Over a ten-year operating period, LSST would 



In the more distant future, 21 cm measurements may improve our understanding of reionization to the point
where this limitation is removed; however, such an advanced understanding is not anticipated in the immediate future.

acquire hundreds of images of every point on the sky, which should go a long way toward identifying
and removing any residual sources of PSF systematics. The incorporation of 6 bands (ugrizy) will
likely lead to the best photometric redshifts practical from the ground over such a wide area. LSST
will survey the entire extragalactic sky available from the south, perhaps 12,000-15,000 deg^2. The
usable density of source galaxies, particularly at high redshift, is not certain as it depends on
both advances in measuring galaxies small compared to the PSF and the quality of photometric
redshifts in the notorious 1.3 < z < 2 range. However, by achieving high S/N on almost every
resolved galaxy, LSST is likely to represent the "ultimate experiment" for ground-based optical
weak lensing.

An alternative approach is to exploit the small and stable PSF and availability of the near-IR
bands from space, as planned for ESA's Euclid mission (scheduled launch in the final quarter of
2019) and the NASA WFIRST mission (whose likely launch date is now closer to ~2024 because
of cost overruns on JWST). Euclid will be a 1.2 m telescope with a 0.5 deg^2 focal plane that will
survey 15,000 deg^2 in a parallel WL+BAO mode, with shape measurements performed in a broad
red band (0.55-0.92 μm). It will resolve many more galaxies than LSST, but the survey is not as
deep and will have to be supplemented with optical photometry from the ground to obtain
photo-zs. Euclid will have only 3-4 observations of each galaxy, but this is predicted to be acceptable
given the much greater stability of Euclid's PSF relative to anything possible on the ground. At
~35 galaxies per arcmin^2, Euclid would measure shapes for nearly 2 billion galaxies. WFIRST, at
least in its current reference configuration (Green et al., 2011), would be a 1.3 m off-axis infrared
space telescope capable of surveying 2840 deg^2/yr in a parallel WL+BAO mode. The baseline WL
program has a much higher degree of data redundancy built in than Euclid (10 observations, 5 in
each of two filters centered at 1.4 and 1.8 μm) but this (and the fact that the telescope must be
shared with other science programs) comes at the expense of what is likely a smaller survey. Given
that the start of WFIRST is at least several years in the future, we anticipate that several more
design cycles and efforts to improve detector technology will be undertaken before the hardware
complement is frozen. Both Euclid and WFIRST plan to acquire near-IR photometry in 3 bands:
Euclid via a separate near-IR instrument, and WFIRST using the same data for WL.
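The figure of nearly 2 billion galaxies for Euclid can be checked from the quoted source density and survey area. The sketch below is simple unit bookkeeping (3600 arcmin^2 per deg^2); the function name is illustrative, and the inputs are the values quoted in the text.

```python
ARCMIN2_PER_DEG2 = 60.0 * 60.0  # square arcminutes in one square degree


def total_sources(density_per_arcmin2: float, area_deg2: float) -> float:
    """Total number of source galaxies for a given surface density and survey area."""
    return density_per_arcmin2 * ARCMIN2_PER_DEG2 * area_deg2


# Euclid values quoted in the text: ~35 galaxies per arcmin^2 over 15,000 deg^2.
euclid_sources = total_sources(35.0, 15000.0)
print(f"Euclid shape sample: {euclid_sources:.2e} galaxies")
```

This gives about 1.9 billion, consistent with the number stated above.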

By the end of the 2020s, we should have a rich data set from all three of these projects (LSST,
Euclid, and WFIRST), and perhaps also from a large-scale radio interferometer such as the SKA.
These surveys represent very different approaches to the Stage IV WL problem and will provide
for multiple cross-checks of final results and internal cross-correlations of different data sets. The
total number of galaxies with accurately measured shapes will probably reach ~4 billion, with
most observed by at least two instruments and some with all three. Robust measurements of the
suppression of the growth of structure to σ_Δγ ~ 0.03 (a factor of several better than Stage III)
should then be possible (see Table [8]), as well as tests of other possible deviations from ΛCDM
that we have not yet imagined. But a great deal of work will be necessary before then to ensure
that the systematic errors are controlled at this level.

Is there a future for WL beyond Stage IV, both in terms of science motivation and technical 
capability? It seems unlikely that there would be a follow-on experiment that consists of simply a 
super-size LSST, Euclid, or WFIRST, particularly given that these experiments will come within 
a factor of a few of the cosmic variance limit at several tens of galaxies per arcmin^2. Rather the
more distant future would have to involve new technology and a new science case not subject to 



In order to compute color corrections, a small subset of the galaxies will be observed through a second filter 
mounted over 2 of Euclid's 36 CCDs. 

The WFIRST 1.1 μm band does not have enough dithers to recover full sampling and from the WL perspective
is only for photo-zs.

the usual limitations. An example might be to look for lensing by primordial gravitational waves,
which is not practical using galaxies as sources (Dodelson et al., 2003) but is at least in principle
possible using highly redshifted 21 cm radiation as the source, even for tensor-to-scalar ratios as low
as 10^-9 (Book et al., 2011). But we have now entered the speculative realm of post-2030 science
and technology, where our ability to forecast the future is of limited reliability. We thus conclude
our discussion of weak lensing here.

6. Clusters of Galaxies



6.1. General Principles 

Galaxy clusters have a long and storied history as cosmological probes. They provided the
first line of evidence for the existence of dark matter (Zwicky, 1933; Smith, 1936), and cluster
mass-to-light ratio measurements suggested that the matter density in the universe was sub-critical
(Ω_m < 1) as far back as the early 1970s (see Gott et al., 1974, and references therein). The evidence
for low Ω_m was substantially strengthened by baryon fraction measurements (White et al., 1993;
Evrard, 1997), and by the discovery of massive clusters at high (z ~ 0.8) redshift (e.g., Henry,
1997; Eke et al., 1998; Donahue et al., 1998). Today, clusters remain an important cosmological
tool, capable of testing cosmology in a variety of ways. Here we focus on cluster abundances
as a tool for constraining the growth of structure in the matter distribution. Tight geometrical
constraints from BAO and supernovae in turn yield tight predictions for structure growth assuming
GR to be correct. Deviations from these predictions, revealed by weak lensing or by clusters, would
constitute direct evidence for modified gravity as the driver of accelerated expansion. The excellent
review by Allen et al. (2011) discusses other cosmological applications of clusters and examines
recent cluster abundance results in detail (see also the earlier review by Voit, 2005); we summarize
recent work in §6.2 but devote most of our attention to methods for Stage III and Stage IV cluster
surveys.

The basic idea of cluster abundance studies is to compare the predicted space density of massive 
halos (Figure [T|) to the observed space density of clusters, which can be identified via optical, X- 
ray, or CMB observables that should correlate with halo mass. In optical searches, the basic 
observable is the richness, the number of galaxies in a specified luminosity and color range within a 
fiducial radius (typically taken to be the estimated virial radius of the halo). In X-ray searches, the 
luminosity L_X, temperature T_X, and inferred gas mass M_gas all provide observable indicators of
halo mass. In CMB searches, clusters can be characterized by the central or integrated value of the
flux decrement Y_SZ produced by the Sunyaev-Zel'dovich (1970; hereafter SZ) effect: Compton
upscattering of CMB photons by hot electrons in the intracluster medium. The product Y_X = T_X M_gas
defines an X-ray observable that should scale with Y_SZ, and numerical simulations predict that Y_X
tracks halo mass more closely than temperature or gas mass alone (Kravtsov et al., 2006).



The first applications of this approach were made by Peebles et al. (1989) and Evrard (1989),
who used observed cluster abundances to argue against an Ω_m = 1 CDM cosmological model (see
also Kaiser, 1986b, 1991, who compared the observed evolution of X-ray clusters to predictions of
a self-similar model with Ω_m = 1). Halo abundance is sensitive to the amplitude of the matter
power spectrum σ_8 and the matter density Ω_m. The mean matter content in a sphere of comoving
radius 8 h^-1 Mpc is ~2 × 10^14 M_⊙. Thus, cluster-mass halos form from the gravitational collapse
of fluctuations on about this scale, and their abundance naturally tracks σ_8. Moreover, because
the total mass of each collapsed volume scales linearly with Ω_m, the number of halos at a given
mass can be raised either by raising σ_8, so that fluctuations are larger, or by raising Ω_m, so that
the mass associated with each perturbation is larger. The quantity most tightly constrained by
cluster abundances is a combination of the form σ_8 Ω_m^q, with q ≈ 0.4 (White et al., 1993). The
degeneracy between σ_8 and Ω_m can be broken by measuring abundances at a variety of masses.
This argument also holds at higher redshift, so one can think of cluster abundances as primarily
constraining σ_8(z) Ω_m^q, modulated by the additional cosmological dependence of the volume element
dV_c(z) ∝ D_A^2 H^-1 dΩ dz, and by any intrinsic dependence of cluster observables on the
distance-redshift relations. Note that, as elsewhere in this article, Ω_m always refers to the z = 0 value unless
Ω_m(z) is written explicitly.
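The statement that a sphere of comoving radius 8 h^-1 Mpc contains a cluster-scale mass can be verified with a few lines of arithmetic. The sketch below uses the standard value of the present-day critical density, 2.775 × 10^11 h^2 M_⊙ Mpc^-3. The values Ω_m = 0.25 and h = 0.7 are illustrative assumptions made here, not the fiducial parameters of this article, and the function name is hypothetical.

```python
import math

RHO_CRIT_H2 = 2.775e11  # present-day critical density in h^2 M_sun Mpc^-3


def mean_mass_in_sphere(radius_h_inv_mpc: float, omega_m: float, h: float) -> float:
    """Mean matter mass (in M_sun) inside a comoving sphere whose radius is
    given in h^-1 Mpc, for matter density parameter omega_m and Hubble parameter h."""
    volume = 4.0 / 3.0 * math.pi * radius_h_inv_mpc**3  # h^-3 Mpc^3
    mass_h_inv_msun = omega_m * RHO_CRIT_H2 * volume     # h^-1 M_sun
    return mass_h_inv_msun / h                           # M_sun


# Illustrative (assumed) parameter values.
OMEGA_M = 0.25
H = 0.7

mass_8 = mean_mass_in_sphere(8.0, OMEGA_M, H)
print(f"mean mass within 8 h^-1 Mpc: {mass_8:.2e} M_sun")
```

For these assumed parameters the result is about 2 × 10^14 M_⊙. Because the mass is linear in Ω_m, the same function also shows directly why raising Ω_m raises the mass associated with each perturbation.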
Figure 18 (a) Cumulative halo counts as a function of limiting mass for a 10^4 deg^2 survey in a
redshift slice z = 0.4 ± 0.05. The solid line shows the fiducial model from Table [1]. The dashed line
corresponds to w = -0.8 with the amplitude of the primordial matter power spectrum held fixed.
The dotted line has w = -0.8, but holds σ_8(z = 0.4) fixed. Residuals relative to the fiducial model
are shown in the bottom panel. The small, nearly constant offset of the dotted line is sourced by
the dark energy dependence of the comoving volume element dV_c. (b) The significance with which
this hypothetical halo sample could distinguish the fiducial model from the alternatives in panel
(a) as a function of mass threshold, using the statistical error of equation (139). The dot-dashed
line shows an additional model in which σ_8(z = 0) is held fixed. Even though the high mass end
of the halo mass function depends most strongly on cosmology, the statistical power of the cluster
abundances is dominated by the low mass end because of the much lower measurement errors.



We illustrate these ideas in Figure [18]. Panel (a) shows the expected halo abundance as a
function of the limiting mass in a redshift slice z = 0.35-0.45 subtending 10^4 deg^2. Plots for
other redshift slices are qualitatively similar. For this plot, and throughout the rest of this section
unless otherwise noted, halo mass refers to the mass enclosed within a sphere whose mean interior
overdensity is Δ = 200 relative to the mean matter density of the universe. The solid line is the
abundance in our fiducial model (see Table [1]), while the dashed line shows the corresponding halo
abundance when setting w = -0.8 and holding Ω_m and the primordial power spectrum amplitude
A_s(k = 0.002 Mpc^-1) fixed. Unlike in the earlier figure, this choice does not leave the CMB observables
fixed, but it better illustrates the intrinsic sensitivity of cluster abundances. For w = -0.8, dark
energy becomes dynamically important earlier than for w = -1, suppressing growth and lowering
σ_8(z = 0.4) from 0.66 to 0.62. This sharply reduces the halo abundance, by ~30% at a threshold
of 10^14 M_⊙ and by ~60% at 10^15 M_⊙. If we raise A_s so as to hold σ_8(z = 0.4) fixed, then the
w = -1 and w = -0.8 models differ by a nearly constant factor of 1.1, which is the ratio of the
comoving volumes of the redshift slices in the two cases. This volume effect is clearly weaker than
the overall scaling of halo abundances with σ_8.

While the mean halo abundance becomes more sensitive to σ_8(z) at higher masses, the statistical
precision with which one can measure σ_8(z) decreases with increasing mass because of the larger
Poisson fluctuations for rarer clusters. This point is illustrated in Figure [18]b, which shows the
statistical significance at which a 10^4 deg^2, z = 0.35-0.45 cluster survey would distinguish the
models shown in panel (a). For reference, we also show the case in which σ_8 is held fixed at z = 0,
which reduces model differences because the growth and volume element effects act in opposite
directions. We discuss statistical errors in cluster abundances, including the role of sample variance,
in §6.3.1. The key conclusion from Figure [18]b is that lower mass clusters allow stronger model
discrimination.
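The trade-off between sensitivity and counting statistics can be made concrete with a Poisson-only estimate: if two models predict counts that differ by a fraction f and the fiducial model predicts N clusters, the separation in units of the Poisson error is roughly f N / sqrt(N) = f sqrt(N). The sketch below applies this to two mass thresholds. The cluster counts are hypothetical numbers chosen only for illustration, sample variance is ignored, and this is not the error model of the equation cited in the caption of Figure 18.

```python
import math


def poisson_significance(n_fiducial: float, fractional_difference: float) -> float:
    """Separation (in sigma) between two models whose expected counts differ by
    fractional_difference, assuming Poisson errors on n_fiducial clusters."""
    return abs(fractional_difference) * n_fiducial / math.sqrt(n_fiducial)


# Hypothetical counts above a low and a high mass threshold (illustration only).
low_mass = poisson_significance(n_fiducial=20000.0, fractional_difference=0.30)
high_mass = poisson_significance(n_fiducial=50.0, fractional_difference=0.60)

print(f"low-mass threshold:  {low_mass:.1f} sigma")
print(f"high-mass threshold: {high_mass:.1f} sigma")
```

Even though the fractional model difference is twice as large at the high-mass threshold, the far larger number of low-mass clusters yields the stronger discrimination.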

Cluster cosmology requires making an explicit link between the theoretically predicted pop- 
ulation of halos as a function of mass and an observed population of clusters. This problem is 
complicated by the fact that the halo population is usually characterized using dark matter simu- 
lations, whereas clusters are identified using baryonically-sourced signatures such as the presence 
of galaxy overdensities, extended X-ray emission, or SZ decrements (see §6.3.2). The lower mass
limit probed by cluster abundance experiments is partly set by the detection thresholds intrinsic
to each method, but also by the difficulty of characterizing the relation between low mass halos
and poor clusters. Different researchers adopt varying definitions of halos and of clusters. Within a
reasonable range, such variation is acceptable, provided each study is self-consistent and the
halo-cluster relation is accurately characterized. In recent years, numerical studies have mostly shifted
from the friends-of-friends algorithm used in earlier work (e.g., Efstathiou et al., 1988) to spherical
overdensity definitions (e.g., Tinker et al., 2008), thus avoiding the tendency of the friends-of-friends
method to occasionally link distinct mass concentrations via narrow bridges (see More et al., 2011,
and references therein for a more detailed discussion). Halo boundaries are typically drawn at
overdensities Δ ~ 100-500, where clusters are in approximate dynamical equilibrium and where
mass predictions are fairly robust to baryonic physics. The overdensity Δ can be quoted relative
to the mean matter density of the universe at the cluster redshift or relative to the critical density
at that redshift. In this section, we will adopt Δ = 200 with respect to the mean density as our
definition unless otherwise specified.
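Because the mean matter density equals Ω_m(z) times the critical density, an overdensity quoted relative to the mean converts to one relative to critical through Δ_crit = Δ_mean Ω_m(z). The sketch below performs this conversion for a flat ΛCDM model with radiation neglected. Both the cosmological model and the value Ω_m = 0.25 are assumptions made for illustration, and the function names are hypothetical.

```python
def omega_m_at_z(omega_m0: float, z: float) -> float:
    """Matter density parameter at redshift z for flat LambdaCDM (radiation neglected)."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def overdensity_wrt_critical(delta_mean: float, omega_m0: float, z: float) -> float:
    """Convert an overdensity defined relative to the mean matter density into
    the equivalent overdensity relative to the critical density at redshift z."""
    return delta_mean * omega_m_at_z(omega_m0, z)


# Illustrative (assumed) matter density; Delta = 200 with respect to the mean.
delta_crit_z0 = overdensity_wrt_critical(200.0, 0.25, 0.0)
delta_crit_z04 = overdensity_wrt_critical(200.0, 0.25, 0.4)
print(f"Delta_crit at z = 0:   {delta_crit_z0:.1f}")
print(f"Delta_crit at z = 0.4: {delta_crit_z04:.1f}")
```

For these assumed parameters, Δ = 200 relative to the mean corresponds to Δ = 50 relative to critical at z = 0 and to roughly 96 at z = 0.4, so the two conventions differ substantially and must be stated explicitly when masses are compared.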

The principal challenge to precision cosmology with clusters is not cluster identification per
se, but the accurate calibration of the relation between cluster observables (e.g., richness, X-ray
luminosity, SZ decrement) and halo masses. Figure [19] illustrates this point by flipping the x and
y axes of panel (a) in Figure [18], thus plotting the mass threshold at fixed cluster abundance for
the different cosmological models. Changing from w = -1 to w = -0.8 while holding A_s fixed
changed the predicted abundances by 30-60%, but the corresponding change in mass threshold is
only about 20%. For fixed σ_8(z = 0.4), the 15% change in abundance corresponds to a 2.5%-6%
change in mass threshold. These, then, are the levels of accuracy in mass calibration that must be
attained to distinguish between the two w = -0.8 models and our fiducial w = -1 model. The
issue of mass calibration will arise repeatedly in this section, especially in §6.3.3 and §6.4.3.
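The link between an abundance change and the equivalent shift in mass threshold can be sketched with a local power-law approximation. If the cumulative counts scale as N(>M) ∝ M^(-α) near the threshold, a model whose abundance is rescaled by a factor f at fixed mass reproduces the fiducial counts at a threshold rescaled by f^(1/α). The slope α = 1.6 used below is an assumed illustrative value, not one taken from Figure 18 or 19, and the function name is hypothetical.

```python
def mass_threshold_factor(abundance_factor: float, slope: float) -> float:
    """Factor by which the mass threshold must change so that a model whose
    abundance is rescaled by abundance_factor (at fixed mass) matches the
    fiducial counts, assuming N(>M) scales locally as M**(-slope)."""
    return abundance_factor ** (1.0 / slope)


# A 30% drop in abundance with an assumed local slope of 1.6.
factor = mass_threshold_factor(abundance_factor=0.70, slope=1.6)
print(f"mass threshold changes by {100 * (factor - 1):+.1f}%")
```

With this assumed slope, a 30% reduction in abundance corresponds to a roughly 20% reduction in mass threshold, the same order as the numbers quoted above. The steeper the mass function, the smaller the mass shift that mimics a given abundance change, which is why mass calibration requirements become so stringent.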

In principle, cluster abundances are sensitive to σ_8(z), Ω_m, and the comoving volume element
dV_c, as well as any inherent sensitivity of the relation between cluster mass and cluster
observables on the distance-redshift relations. To simplify our discussion, we will usually assume that a
combination of other data sets (CMB, SN, BAO, WL, etc.) will determine both Ω_m and dV_c(z)
at higher precision than that achievable from cluster abundances. Consequently, we will focus
on the sensitivity of cluster abundances to σ_8(z) while holding Ω_m, dV_c(z), and the angular and
luminosity distances fixed. In practice, we expect our assumption should be a good one as far as
the comoving volume element and the distances are concerned. However, the sensitivity attainable
with clusters is high enough that holding Ω_m fixed may be incorrect in detail. We will discuss this
point in §6.6 and again in §8.4.

Many cluster cosmology papers quote masses in h^-1 M_⊙ because observational mass estimates
(and, to some extent, theoretical predictions) scale inversely with h. However, at non-zero redshift
many other parameters also come into play, and h is itself one of the parameters constrained by
dark energy experiments. Thus, we have elected to quote masses in M_⊙ rather than h^-1 M_⊙.
In a similar vein, we will switch most of our subsequent discussion from σ_8 to σ_11,abs, the rms
fluctuation on a scale of R = 11 Mpc (equal to σ_8 for h = 0.727). For some observables (e.g.,
the X-ray estimated gas mass M_gas), the inferred cluster mass is sensitive to the angular diameter
Figure 19 Halo mass thresholds as a function of cumulative number counts, i.e. flipping the x and
y axes of Figure [18]a. The x-axis shows the number of halos predicted in a 10^4 deg^2 survey in a
redshift slice z = 0.4 ± 0.05. The lower panel shows the fractional change in mass threshold relative
to the fiducial cosmological model.



distance D_A(z), and this dependence itself provides useful cosmological constraints; this point is
discussed by Allen et al. (2011) but we will not address it further here. Our primary focus is
the statistical precision with which cluster abundances constrain σ_11,abs(z), and the level at which
systematic uncertainties must be controlled to achieve these statistical limits. In §6.6 we compare
the precision potentially attainable with clusters to forecasts (described in §8) from fiducial Stage
III and Stage IV CMB+SN+BAO+WL programs.
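The equivalence between the 11 Mpc scale and the conventional 8 h^-1 Mpc scale for h = 0.727 is a one-line unit conversion, sketched here for completeness. The function name is illustrative.

```python
def radius_in_mpc(radius_h_inv_mpc: float, h: float) -> float:
    """Convert a comoving radius from h^-1 Mpc to Mpc for a given Hubble parameter h."""
    return radius_h_inv_mpc / h


r_abs = radius_in_mpc(8.0, 0.727)
print(f"8 h^-1 Mpc for h = 0.727 is {r_abs:.2f} Mpc")
```

Defining the fluctuation amplitude on a fixed scale in Mpc removes the implicit dependence on h that is built into σ_8.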



6.2. The Current State of Play

Most cluster cosmology studies of the past decade have been based on X-ray catalogs, with
typical cluster samples numbering in the several tens to few hundreds of clusters. The vast
majority of these catalogs rely on ROSAT data, either from the ROSAT All-Sky Survey (RASS;
Voges et al., 1999) or from serendipitous detections in pointed observations, though there are
also samples selected based on XMM-Newton and Chandra imaging. Table [4] summarizes some of
the main X-ray catalogs that have been employed in these studies. The recently approved XXL
survey will add ~50 deg^2 of imaging, contributing ~600 clusters out to z = 1 and above. The
next big step forward for X-ray samples is the eROSITA mission, which should identify ~80,000
galaxy clusters at high confidence (see §6.5).

The largest existing cluster samples are optically selected, using either spectroscopic or photometric
galaxy catalogs. Spectroscopic catalogs tend to be shallow, with typical z < 0.2 (Merchan and
Zandivarez, 2003; Kochanek et al., 2003; Miller et al., 2005; Merchan and Zandivarez, 2005;
Berlind et al., 2006; Yang et al., 2007; Li and Yee, 2008; Blackburne and Kochanek, 2011),
though high redshift spectroscopic catalogs do exist (Gerke et al., 2005; Coil et al., 2006).
Photometric cluster catalogs hail back as far as the original Abell (1958) catalog, which contained
upwards of 2500 systems and served as the primary basis of cluster studies for decades. Though
many recent photometric catalogs have focused on narrow but deep survey data (z < 1, e.g.,
Gonzalez et al., 2001; Gladders and Yee, 2005; Milkeraitis et al., 2010; Adami et al., 2010), the
SDSS has led to the publication of several moderately deep (z < 0.5) and wide catalogs, which can
contain upwards of 50,000 clusters (e.g.,
Table 4. X-ray Cluster Catalogs

Catalog/Reference                            Type of Survey   No. of Clusters   Redshift Limit

BCS (Ebeling et al., 2000)                   Wide/Shallow     107               0.3
NORAS (Böhringer et al., 2000)               Wide/Shallow     378               0.3
HIFLUGCS (Reiprich and Böhringer, 2002)      Wide/Shallow     63                0.2
WARPS (Perlman et al., 2002)                 Narrow/Deep      34                0.8
SHARC (Burke et al., 2003)                   Narrow/Deep      48                0.7
160 deg^2 (Mullis et al., 2003)              Narrow/Deep      201               0.7
REFLEX (Böhringer et al., 2004)              Wide/Shallow     447               0.3
400 deg^2 (Burenin et al., 2007)             Narrow/Deep      287               0.8
MACS (Ebeling et al., 2010)                  Wide/Shallow     34                0.6
MCXC (Piffaretti et al., 2010)               Compilation      1783              0.8
XCS (Lloyd-Davies et al., 2010)              Narrow/Deep      1022/3669*        0.8

Note: All cluster catalogs included above are drawn from ROSAT data, except for XCS, which
is a serendipitous cluster search in XMM-Newton archival data (see Mehrtens et al., 2011, for the
first data release). Wide/shallow survey catalogs refer to cluster searches in the ROSAT All-Sky
Survey (RASS), whereas narrow/deep catalogs are drawn from pointed ROSAT or XMM-Newton
observations. MCXC is a compilation of various X-ray cluster catalogs. The characteristic high
redshift limit shown is not the redshift of the highest redshift cluster in the sample, but rather a
redshift that contains > 90% of the galaxy clusters. The highest cluster redshifts can be significantly
higher than the redshift quoted, as expected for flux limited surveys.

*1022 is the number of galaxy clusters with > 300 photons, allowing for T_X estimates. 3669 is the
number of 4σ cluster candidates.
Koester et al., 2007; Wen et al., 2009; Hao et al., 2010; Szabo et al.). Cluster catalogs
out to z ~ 1 over 1000 deg^2 or more from current or near future photometric surveys, such as
RCS-2, DES, Pan-STARRS, and HSC, will expand samples to the hundreds of thousands.

One limiting factor that affects these optical cluster finding experiments is that the 4000 Å
break in the spectrum of early-type galaxies shifts into the near-IR at z ≈ 1, making optical
detection challenging above this redshift. This difficulty can be overcome with IR adaptations
of optical cluster finding techniques. Today, there are two independent efforts aiming to detect
galaxy clusters using IR data: the IRAC Shallow Cluster Survey (ISCS; Eisenhardt et al.,
2008) and the Spitzer Adaptation of the Red-Sequence Cluster Survey (SpARCS; Wilson et al.,
2006). Both surveys have discovered and spectroscopically confirmed candidate galaxy clusters
out to redshift z ~ 1.5 (e.g., Stanford et al., 2005; Brodwin et al., 2006; Eisenhardt et al., 2008;
Muzzin et al., 2009; Wilson et al., 2009; Demarco et al., 2010), some of which have detected X-ray
emission (Brodwin et al.). Both surveys are also expected to detect candidates at z ≈ 1.5-2,
though spectroscopic confirmation of such systems will be observationally challenging. These early
results are encouraging and suggest that IR detection of high redshift clusters can play an important
role in the future of cluster cosmology.



While detections of the SZ effect in known galaxy clusters date back as early as 1976 (Gull and
Northover, 1976), it is only recently that instrumentation advances have made large scale SZ searches
feasible. The first three successful cluster SZ surveys, using the South Pole Telescope (SPT), the
Atacama Cosmology Telescope (ACT), and the Planck satellite, are all currently ongoing. All three
projects have released SZ-selected cluster samples (Vanderlinde et al., 2010; Marriage et al., 2011;
Planck Collaboration, 2011b; Williamson et al., 2011). These samples tend to be of very massive
clusters (see Figure [25]) and, in the case of ACT and SPT, extend to z ~ 2; for these two surveys
the redshift coverage is limited only by the abundance of such massive objects at high redshift.
Planck is limited in
part by its relatively large beam, but it has the important benefit of being an all sky survey, which
results in a larger cluster yield overall. Based on the sensitivity estimates shown in Figure [25] below,
we anticipate ~700 clusters in 2500 deg^2 for SPT and ~11,000 over the full sky for Planck. We
emphasize, however, that these numbers can easily shift by factors of ~2-3 depending on the
signal-to-noise cut adopted for cluster identification. In contrast to optical and X-ray techniques,
there is not likely to be a major leap forward in SZ capabilities in the next few years, so the SPT,
ACT, and Planck samples will probably remain the largest SZ cluster samples available for the
next decade. That said, the limiting masses of SZ cluster samples will go down as these and other
facilities conduct deeper surveys focused on CMB polarization (e.g., ACTPol and SPTPol).

Existing cluster cosmology constraints have come primarily from X-ray data (see, e.g., Henry,
2000; Reiprich and Böhringer, 2002; Schuecker et al., 2003; Allen et al., 2003; Pierpaoli et al., 2003),
reflecting the fact that X-ray observables can be related to mass via simulations and/or analytic
approximations and by hydrostatic modeling for well observed clusters. All three of the most
recent X-ray analyses yielded tight, consistent cosmological constraints, which can be summarized
as σ_8 (Ω_m/0.25)^0.45 = 0.80 ± 0.03 (Henry et al., 2009; Vikhlinin et al., 2009; Mantz et al., 2010).
Cosmological analyses from optical samples have typically been less constraining because of
uncertain mass calibration (see, e.g., Bahcall et al., 2003; Gladders et al., 2007; Wen et al., 2010).
However, recent work that uses stacked weak lensing analysis for mass calibration (Johnston et al.,
2007; Mandelbaum et al., 2008; Sheldon et al., 2009) has allowed optical samples to achieve the
same level of precision as X-ray samples (Rozo et al., 2010), with comparable levels of systematic
error. Constraints from SZ selected samples are emerging (Vanderlinde et al., 2010; Sehgal et al.,
2011), and while they are currently weak because of the relatively large uncertainty in the SZ-mass
scaling relation, the extensive follow-up campaigns that are currently underway will reduce this
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Figure 20 Comparison of the 68% confidence regions derived from galaxy cluster abundances and 
WM AP CMB data by v a rious groups. The f irst th ree error ellipse s — u sing quoted uncertainties 
from iMantz et al.l (|20ld ). iHenrv et al.1 hoO^). an d IVikhlinin et al.l (|2009l ) — all come from X-ray 
selected cluster samples. The Rozo et al. ( 20ld ) ellipse comes from a. n optic ally selected cluster 
sample with stacked weak lensing mass calibration. The [Tinker et al.l (|2012l ) constraint uses the 
same optical clusters and mass calibration, but but relies on galaxy clustering and mass-to-number 
ratios to derive cos mological constraints, making it essentially an independent cross-check. The 
Benson et all (j201lh ellipse comes from the SPT selected cluster sample. 



scaling uncertainty and bring these constraints to a level comparable to those from optical and 
X-ray cluster catalogs. 

Regardless of the wavelength of choice, current cluster abundance constraints are limited not
by the number of clusters but by uncertainty in mass calibration. Figure 20 shows the cluster
abundance constraints from several recent analyses. Because the current X-ray and optical mass
calibrations are fundamentally different (hydrostatic vs. weak lensing), the excellent agreement
illustrated in Figure 20 provides a strong test of systematic uncertainties. However, the results from
the Planck Collaboration (2011a) have sounded a cautionary note, as the optical mass estimates
used to derive cosmological parameters in Rozo et al. (2010) appear to be inconsistent with SZ data
(see also Draper et al., 2011). Rozo et al. (2012, in prep) argue that there is no real tension and
point towards an alternative interpretation of the Planck Collaboration (2011a) results. Regardless
of how this issue is ultimately resolved, it is clear that further tightening cosmological constraints
will require a significant improvement in our ability to estimate cluster masses.

On this last count, we note that Figure 20 also includes cosmological constraints from an analysis
by Tinker et al. (2012) that does not rely on cluster abundances. Tinker et al. (2012) use a halo
occupation model (see §2.3) fit to SDSS galaxy clustering, which yields a prediction for the
mass-to-number ratio of clusters (analogous to a mass-to-light ratio, but with galaxy number instead
of integrated luminosity) as a function of $\sigma_8$ and $\Omega_m$. While this analysis uses the same weak
lensing mass calibration as Rozo et al. (2010), the method is less sensitive to the mass scale and
entirely independent of abundance uncertainties, making it a largely independent measurement
and a powerful systematics cross-check. The same approach can be adapted to future, deeper
photometric surveys.

6.3. Observational Considerations 

6.3.1. Expected Numbers and Cosmological Sensitivity 

Figure 21a shows the expected cluster counts in our fiducial cosmological model for a variety of
limiting masses, as a function of the limiting redshift $z$ of a $10^4$ deg$^2$ survey. (Note that these are
lower limits on mass but upper limits on redshift.) Panel (b) shows number counts in redshift bins
of width $\pm 0.05$; e.g., at $z = 0.15$, we show the halo counts in the redshift bin [0.1, 0.2]. We maintain
this redshift binning convention throughout. Together, these two panels give a broad-brush sense
for the typical sample sizes and redshift distribution of galaxy clusters as a function of limiting
mass and redshift.

Assuming halo masses can be adequately measured, the statistical error in cluster abundances
is the sum in quadrature of Poisson noise and sample variance (Hu and Kravtsov, 2003):

$$(\Delta N)^2 = N + b^2 N^2 \sigma^2(V). \qquad (139)$$

Here, $N$ is the mean number of halos in the volume of interest, $b$ is the mean bias of the halos,
and $\sigma^2(V)$ is the variance of the matter density field over the survey volume.* Figure 21c shows
the fractional error $\Delta N/N$ for the fiducial model, again for redshift bins $z = z_c \pm 0.05$ where $z_c$
is the central redshift of the bin. Sample variance becomes larger than Poisson variance below a
transition mass $\sim 4 \times 10^{14}\,M_\odot$ at $z = 0.1$ and $\sim 10^{14}\,M_\odot$ at $z = 1$. However, the statistical error
is never more than a factor ~2 above the $N^{-1/2}$ Poisson expectation (see Figure 24 below), and
total statistical errors should scale with survey area roughly as $(A/10^4\,{\rm deg}^2)^{-1/2}$. For any mass
threshold the statistical error first decreases with redshift, as the number of clusters grows with the
increasing comoving volume per $\Delta z$. This trend flattens when the clusters become exponentially
rare, at which point further increase in redshift leads to a precipitous drop in the number of clusters
and a corresponding rise in Poisson errors. These competing effects lead to the characteristic
U-shape of the curves in Figure 21c.
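
As a rough illustration of equation (139), the following sketch evaluates the fractional count error and the mean count at which the Poisson and sample-variance terms become equal. The function names are ours, and the mean counts used in the example are hypothetical; the bias and $\sigma(V)$ values are the $z = 0.6$ numbers quoted in the footnote.

```python
import math


def fractional_count_error(n_mean, bias, sigma_v):
    """Delta N / N from equation (139): (Delta N)^2 = N + b^2 N^2 sigma^2(V)."""
    variance = n_mean + (bias * n_mean * sigma_v) ** 2
    return math.sqrt(variance) / n_mean


def crossover_count(bias, sigma_v):
    """Mean count N at which the Poisson term N equals the sample-variance term."""
    return 1.0 / (bias * sigma_v) ** 2


if __name__ == "__main__":
    bias, sigma_v = 3.0, 0.002  # footnote values at z = 0.6 (10^14 Msun threshold)
    for n_mean in (1.0e2, 1.0e3, 1.0e4, 1.0e5):  # hypothetical bin counts
        total = fractional_count_error(n_mean, bias, sigma_v)
        poisson = n_mean ** -0.5
        print(f"N = {n_mean:8.0f}  dN/N = {total:.4f}  ratio to Poisson = {total / poisson:.2f}")
    print("crossover N:", crossover_count(bias, sigma_v))
```

Bins with counts well below the crossover value are Poisson dominated, while bins well above it are limited by sample variance.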

Figure 21d converts these statistical abundance errors to equivalent errors in mass by dividing
$\Delta N/N$ by the logarithmic slope of the cumulative halo mass function, $\alpha = -d\ln N/d\ln M$, which
ranges between 2 and 5 depending on redshift and mass. While observational samples are not
thresholded exactly in mass, the sensitivity of cluster abundances to an overall shift in the mean
mass at fixed observable is well captured by this heuristic argument. In order for clusters to saturate
the statistical limit in the abundances, the uncertainty in mass calibration must be smaller than
this $\Delta M/M$. For a $10^4$ deg$^2$ survey and $M > 8 \times 10^{14}\,M_\odot$, a mass accuracy of 3%-10% (depending
on $z$) suffices. By $M \sim 2 \times 10^{14}\,M_\odot$, however, the accuracy requirement has sharpened to < 1%.
(This last number agrees well with the more detailed analysis of Cunha and Evrard [2010] for a
mass threshold of $10^{14.2}\,M_\odot$; see in particular the top panels in their Figure 2.) Achieving such
accuracy is a tall order, and current studies are clearly limited by the systematic uncertainty in
cluster masses rather than abundance statistics. Note that the required accuracy scales roughly as
$(A/10^4\,{\rm deg}^2)^{-1/2}$, and it applies to the overall mass scale (i.e., the mean of the mass-observable
relation) rather than the mass of any individual system.
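
The conversion from abundance error to required mass accuracy described above can be sketched as follows. This is a minimal illustration under the stated heuristic, $\Delta M/M = (\Delta N/N)/\alpha$ with an area scaling of $(A/10^4\,{\rm deg}^2)^{-1/2}$; the function name and the input values in the example are hypothetical.

```python
import math


def required_mass_accuracy(frac_count_error, alpha, area_deg2=1.0e4, ref_area_deg2=1.0e4):
    """Mass-scale accuracy needed to stay limited by abundance statistics.

    frac_count_error is Delta N / N evaluated for the reference area; the result
    is rescaled to another survey area as (A / A_ref)^(-1/2).
    """
    area_scaling = math.sqrt(ref_area_deg2 / area_deg2)
    return frac_count_error / alpha * area_scaling


if __name__ == "__main__":
    # Hypothetical bin: 3% abundance error and mass-function slope alpha = 3.
    print(required_mass_accuracy(0.03, 3.0))                    # reference 10^4 deg^2 survey
    print(required_mass_accuracy(0.03, 3.0, area_deg2=2500.0))  # a quarter of the area
```

A smaller survey has larger statistical errors, so its mass-calibration requirement is correspondingly looser.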



*For example, in our fiducial cosmology at $z = 0.6$, the matter variance in a volume of $\Delta z = 0.1$ and area 10,000
deg$^2$ is $\sigma(V) \approx 0.2\%$, and the mean halo bias is ~3.0 and ~5.7 for mass thresholds of $10^{14}\,M_\odot$ and $4 \times 10^{14}\,M_\odot$,
respectively.

Figure 21 (a) Cumulative halo number counts as a function of the limiting redshift for a variety
of mass thresholds. We assume the fiducial cosmological model from Table 1 and survey area of
$10^4$ deg$^2$. (b) Counts in redshift bins $z = z_c \pm 0.05$. (c) Statistical error in the number of clusters
above a given limiting mass from equation (139), again in redshift bins $z = z_c \pm 0.05$. (d) The mass
accuracy required to ensure that cosmological constraints are limited by the statistical precision in
the number of galaxy clusters rather than by uncertainties in mass estimation.

Figure 22 translates the errors on cluster abundance from Figure 21 to errors on the matter
power spectrum amplitude $\sigma_{11,{\rm abs}}(z)$, again for a $10^4$ deg$^2$ survey with $z = z_c \pm 0.05$ bins. For
simplicity, we assume that $\Omega_m$, the comoving volume element $dV_c(z)$, and the power spectrum shape
are perfectly known from independent data (CMB+SN+BAO+WL), so that $\sigma_{11,{\rm abs}}(z)$ is the single
cosmological parameter controlling the cluster abundance. As discussed in §6.1, if the uncertainty
in $\Omega_m$ is non-negligible, then it is the combination $\sigma_8(z)\Omega_m^{0.4}$ that is constrained instead. Panel
(a) shows the case where mass calibration errors are negligible. The errors on $\sigma_{11,{\rm abs}}(z)$ roughly
track the abundance errors $\Delta N/N$ in Figure 21c, but because the sensitivity of the abundance to
$\sigma_{11,{\rm abs}}(z)$ at fixed mass increases with increasing redshift, the best constraint on $\sigma_{11,{\rm abs}}(z)$ comes
at a higher redshift than the one at which $\Delta N/N$ is minimized. The remaining panels show the
impact of 1%, 2%, 4%, and 8% mass calibration errors for three different threshold masses.

Figure 22 (a) Statistical error on $\sigma_{11,{\rm abs}}(z)$ as a function of redshift, in redshift bins $z = z_c \pm 0.05$,
for different mass thresholds as labeled. We assume a $10^4$ deg$^2$ survey area, and the fiducial
cosmological model. We also assume that $\Omega_m$, the shape of the matter power spectrum, and the
comoving volume element $dV_c$ are perfectly known from independent data (CMB+SN+BAO+WL).
Panels (b)-(d) refer to specific mass thresholds as labeled. In each panel the solid curves show
the effect of different mass calibration uncertainties as labeled while the dotted curves assume the
perfect mass calibration values (i.e., number statistics limited) from panel (a). For reference, the
uncertainty in $\sigma_{11,{\rm abs}}(z)$ that we forecast for a fiducial CMB+SN+BAO+WL program is ~1% for
Stage IV data sets and ~2-3% for Stage III data sets (see §6.6 and §8.4).

The basic features in Figure 22 are simple to understand at a quantitative level, starting from
the knowledge that cluster abundances constrain the combination $\sigma_{11,{\rm abs}}(z)\Omega_m^q$ with $q \approx 0.4$. Since
the mass of a collapsed volume scales linearly with $\Omega_m$, a shift of the mass scale by a constant
factor is nearly degenerate with a change of $\Omega_m$ by the same factor. Together these scalings imply
$\sigma_{11,{\rm abs}}(z) \propto M^q$, where $M$ is the mass scale at fixed abundance, making $\Delta\ln\sigma_{11,{\rm abs}}(z) \approx q\,\Delta\ln M$
for a survey limited by mass calibration uncertainty $\Delta\ln M$. For a survey limited by halo statistics,
the corresponding effective mass error is $(\Delta\ln M)_{\rm eff} = \alpha^{-1}\Delta\ln N$, where $\alpha = -d\ln N/d\ln M \approx 2-5$
is the slope of the cumulative halo mass function, so in this case $\Delta\ln\sigma_{11,{\rm abs}}(z) \approx q\,\alpha^{-1}\Delta\ln N$.
Combining the two limits we arrive at

$$\Delta\ln\sigma_{11,{\rm abs}}(z) \approx q \times \max\left[\Delta\ln M,\ \alpha^{-1}\Delta\ln N\right]. \qquad (140)$$
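
Equation (140) is simple enough to evaluate directly. The sketch below is a minimal illustration: the function names are ours and the numerical inputs are hypothetical, chosen only to show the two regimes.

```python
def sigma11_fractional_error(q, dln_mass, alpha, dln_counts):
    """Equation (140): Delta ln sigma_11,abs ~ q * max[Delta ln M, Delta ln N / alpha]."""
    return q * max(dln_mass, dln_counts / alpha)


def limiting_regime(dln_mass, alpha, dln_counts):
    """Report which term of equation (140) dominates the error budget."""
    return "mass calibration" if dln_mass > dln_counts / alpha else "halo statistics"


if __name__ == "__main__":
    q, alpha = 0.4, 3.0  # representative values quoted in the text
    cases = [(0.05, 0.03), (0.005, 0.03)]  # hypothetical (Delta ln M, Delta ln N) pairs
    for dln_mass, dln_counts in cases:
        err = sigma11_fractional_error(q, dln_mass, alpha, dln_counts)
        print(dln_mass, dln_counts, err, limiting_regime(dln_mass, alpha, dln_counts))
```

In the first hypothetical case the 5% mass calibration error dominates; in the second, the effective mass error from counting statistics does.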

Equation (140) fits the data in Figure 22 with better than 30% accuracy (typically < 15%).

Figure 23 The degeneracy exponent $q$ as a function of redshift for a series of threshold masses. The
parameter $q$ is the exponent in $\sigma_{11,{\rm abs}}(z)\Omega_m^q$ that holds the abundance of galaxy clusters above the
quoted threshold mass at the appropriate redshift bin fixed for small, oppositely directed changes
in $\sigma_{11,{\rm abs}}(z)$ and $\Omega_m$.



Figure 23 plots the value of the degeneracy exponent $q$ as a function of limiting mass and
redshift. In the Press-Schechter (1974) theory of the halo mass function, the cumulative abundance
is set by the probability that a point in a Gaussian field of variance $\sigma^2(M)$ exceeds the critical
threshold $\delta_c \approx 1.69$ for spherical collapse (see §2.3), so that $N \propto [1 - {\rm erf}(\delta_c/\sqrt{2}\,\sigma(M))]$. Putting
in the $\sigma(M, z)$ relation for a $\Lambda$CDM power spectrum yields a logarithmic derivative $d\ln N/d\ln\sigma =
\alpha_\sigma \approx 5-9$ depending on mass and redshift. Because cluster abundances are degenerate in $\Omega_m/M$,
the logarithmic derivative of cluster abundances relative to $\Omega_m$ is the same as the slope $\alpha$ of the
mass function (but with opposite sign), so locally the cumulative mass function scales as

$$N(M) \propto \left[\sigma_{11,{\rm abs}}(z)\right]^{\alpha_\sigma} \Omega_m^{\alpha} = \left[\sigma_{11,{\rm abs}}(z)\,\Omega_m^{\alpha/\alpha_\sigma}\right]^{\alpha_\sigma}. \qquad (141)$$

We see that halo abundances are degenerate in $\sigma_{11,{\rm abs}}(z)\Omega_m^q$ with $q = \alpha/\alpha_\sigma \approx 3/7 \approx 0.4$. We
plot the ratio $\alpha/\alpha_\sigma$, computed using the Tinker et al. (2008) mass function rather than the
Press-Schechter mass function, in Figure 23.


A cluster abundance analysis becomes limited by mass scale uncertainty rather than halo
abundance statistics when $\Delta\ln M > \alpha^{-1}\Delta\ln N$. If we approximate the error as Poisson, $\Delta\ln N =
N^{-1/2}$, then an experiment is limited by mass uncertainty if the sample size is $N > (\alpha\Delta\ln M)^{-2}$.
Current systematic uncertainties in mass calibration are ~10%, which for $\alpha \sim 3$ corresponds to
$N \sim 10$. Thus, cluster abundance studies are limited by uncertainty in the overall mass scale
for samples with as few as ~10-20 galaxy clusters. For cluster samples with $N \sim 10^3$ ($10^4$),
the accuracy required in mass estimation for an experiment to be dominated by halo statistics is
~1% (0.3%). So that one may apply the rule-of-thumb estimates derived in this section, Figure
24 plots the mass-function slope $\alpha$ and the ratio of the total error $\Delta\ln N$ to the Poisson
uncertainty $N^{-1/2}$. Note that abundance errors including sample variance almost never exceed twice the
Poisson error and are often much closer. Using Figures 23 and 24 along with equation (140), one
can quickly estimate how well an experiment with given number of galaxy clusters can constrain
$\sigma_{11,{\rm abs}}(z)$.

Figure 24 Left: Logarithmic derivative $\alpha = -d\ln N/d\ln M$ of the cumulative halo counts, as a
function of redshift, for five mass thresholds as labeled. Right: The ratio of the total (Poisson +
sample variance) error in the halo counts $\Delta\ln N$ to the Poisson error $N^{-1/2}$. Solid lines assume a
survey area of 10,000 deg$^2$, while dashed lines correspond to 100 deg$^2$. In conjunction with Fig. 23
and equation (140), these figures allow one to quickly estimate how well $\sigma_{11,{\rm abs}}(z)$ can be constrained
at each redshift by a galaxy cluster sample with $N$ clusters.
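
The rule-of-thumb numbers in this paragraph follow from two one-line expressions, sketched below under the Poisson approximation $\Delta\ln N = N^{-1/2}$. The function names are ours; the inputs are the representative values quoted in the text.

```python
def critical_sample_size(alpha, dln_mass):
    """Sample size above which mass-scale uncertainty dominates: N > (alpha * Delta ln M)^-2."""
    return (alpha * dln_mass) ** -2


def mass_accuracy_for_statistics_limit(alpha, n_clusters):
    """Mass-scale accuracy needed for a sample of n_clusters to remain statistics limited."""
    return n_clusters ** -0.5 / alpha


if __name__ == "__main__":
    alpha = 3.0
    print("critical N for 10% mass calibration:", critical_sample_size(alpha, 0.10))
    for n_clusters in (1.0e3, 1.0e4):
        print(n_clusters, mass_accuracy_for_statistics_limit(alpha, n_clusters))
```

Including sample variance would raise $\Delta\ln N$ by at most a factor of about two, which loosens these requirements by the same factor.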

If $\Omega_m$ and $dV_c(z)$ are not perfectly known, then cluster abundances will constrain a combination
of cosmological parameters rather than the matter fluctuation amplitude alone. Predicted
abundances are proportional to $dV_c(z)$, so for an experiment dominated by uncertainty in the mass scale,
uncertainty in the volume element will affect the interpretation if $\Delta\ln dV_c \gtrsim \alpha\Delta\ln M$, the
effective abundance uncertainty. SN and BAO surveys should typically yield uncertainties below this
limit, so we expect regarding $dV_c(z)$ as known to be an adequate approximation for our purposes.
Since a pure shift in $\Omega_m$ is equivalent to a shift in mass scale, uncertainties in $\Omega_m$ are relevant if
$\Delta\ln\Omega_m > \Delta\ln M$, where we have again assumed the experiment in question is dominated by the
mass error $\Delta\ln M$. If the uncertainty in $\Omega_m$ is larger than this critical scale, then clusters will
effectively constrain $\sigma_{11,{\rm abs}}(z)\Omega_m^q$ rather than $\sigma_{11,{\rm abs}}(z)$ alone. Equation (140) will still hold, but
one must replace $\Delta\ln\sigma_{11,{\rm abs}}(z)$ by $\Delta\ln[\sigma_{11,{\rm abs}}(z)\Omega_m^q]$. Current fractional uncertainties in $\Omega_m$ are
~10%, comparable to mass calibration systematics. Future studies will reduce $\Omega_m$ uncertainties,
but they may remain significant compared to improved mass calibration errors in cluster surveys.

We have focused our discussion here on cumulative cluster abundances (i.e., space densities of
clusters above a mass threshold), while observational analyses usually examine the differential
distribution as a function of observable mass-proxies. Differential distributions are useful for breaking
degeneracies (e.g., among $\sigma_{11,{\rm abs}}$, $\Omega_m$, and $dV_c$), and for constraining "nuisance parameters" such as
the scatter of the observable-mass relation. However, for single-parameter constraints on $\sigma_{11,{\rm abs}}(z)$,
we expect that our analysis of the cumulative abundance uncertainties provides an accurate guide,
as it makes use of the single number best determined by the data for any given mass threshold and
redshift range.

6.3.2. Cluster Finding 

Each of the three main methods for finding galaxy clusters (optical, X-ray, and SZ) has
its own virtues and deficiencies. The principal advantage of optical surveys is sheer statistics,
reflecting the low mass threshold for optical detection; clusters with masses as low as $5 \times 10^{13}\,M_\odot$
are capable of hosting significant galaxy overdensities. Near-future surveys (RCS-2, DES, HSC,
Pan-STARRS) should find $\sim 10^5$ systems in areas of $10^3 - 10^4$ deg$^2$ out to $z \sim 1$. On a longer time
scale (~10 years), surveys with LSST should increase the available cluster samples by another
factor of 5-10, due both to larger area ($\approx 20{,}000$ deg$^2$) and to deeper imaging, which should allow
cluster detection out to $z \sim 1.5$. Finally, cluster searches in the IR are capable of finding galaxy
clusters out to $z \sim 2$, but large survey areas to this depth will only be achievable with the advent
of Euclid and/or WFIRST. With the stacked weak lensing mass calibration that we advocate in
§6.3.3, the calibration accuracy scales with cluster number as $N^{-1/2}$, so enormous samples are
statistically advantageous even if mass uncertainties dominate the error budget.

The main drawback for optical cluster detection is projection effects, i.e., chance alignments of 
multiple low mass halos along the line of sight that are misidentified as a single massive galaxy 
cluster. While this systematic has been drastically suppressed in modern surveys with multi-band 
photometry and photometric redshift estimators, one still expects 5%-20% of photometrically
selected clusters to suffer from serious projection effects (Cohn et al., 2007; Rozo et al., 2011a).
The importance of projection effects increases with decreasing mass, so we expect it is projection
effects rather than survey depths that will ultimately set the detection mass threshold for optical
cluster finding in future surveys.

Unlike optical studies, X-ray cluster searches are nearly free from projection effects. This 
robustness to the presence of structures along the line of sight reflects the fact that X-ray emission
scales as density-squared, which enhances the relative contrast of a cluster in the sky, and it is
the principal reason that X-rays are considered the cleanest method for selecting galaxy clusters.
The main difficulty for X-ray selection is a technological one, specifically, the need for space-based
observatories. A dramatic leap forward in capabilities will happen with the launch of eROSITA,
which should detect $\sim 10^5$ galaxy clusters over the full sky out to $z = 1$ and beyond, ensuring that
X-rays will continue to play a critical role in the development of cosmologically relevant cluster 
samples over the coming decade. On a longer time scale, further improvements would require 
X-ray observatories that reach lower flux limits with higher angular resolution, both of which are 
needed to detect large numbers of systems at z > 1. 

The primary advantage of SZ searches is that they do not suffer from cosmological dimming. The 
SZ signature arises from up-scattering of CMB photons by the hot intra-cluster plasma, and because 
the number of up-scattered photons does not depend on the distance to the cluster the signal is 
roughly redshift independent. In practice, the SZ signal is not exactly redshift independent because 
of residual sensitivity to the relative size of the cluster and the beam of the telescope. Unfortunately, 
achieving sufficient sensitivity to detect low mass clusters in SZ is technologically very challenging. 
For instance, the current SPT, ACT, and Planck surveys are expected to be complete at all redshifts
above mass thresholds of $7 \times 10^{14}\,M_\odot$, $10^{15}\,M_\odot$, and $2 \times 10^{15}\,M_\odot$, respectively (Vanderlinde et al.,
2010; Marriage et al., 2011; Planck Collaboration, 2011b); while these limits will go down, they will
not reach thresholds comparable to those of X-ray or optical cluster selection. Consequently, while
these experiments are currently the best avenue to probe the $z \sim 1$ massive cluster population, on
a 3-5 year time scale the focus of cluster detection is likely to shift towards optical and X-ray. To
our knowledge, there are no current plans to develop a new generation of SZ survey instruments
that would dramatically improve upon the capabilities of current experiments for cluster detection,
at least compared to the differences in optical (e.g., DES vs. SDSS) and X-ray (eROSITA vs.
ROSAT). However, both SPTpol and ACTpol should lead to significantly lower mass thresholds
for SZ cluster detection than the current SPT and ACT cluster samples.

Figure 25 showcases the difference of the cluster populations from the various selection methods,
where we have limited ourselves to wide surveys (1000 deg$^2$ or higher) and have shown only a
handful of representative selection functions. The top row shows the selection functions for existing
or ongoing surveys, while the bottom row shows the selection for future surveys. The left panels
show the limiting mass as a function of redshift for each of the surveys considered, while the right
panels show the number above the limiting mass in a redshift bin of width $\Delta z = 0.1$, accounting for
survey area. We emphasize that in practice cluster samples never have a sharp mass threshold; the
curves shown in Figure 25 are only roughly indicative of the mass and redshift ranges probed. The
number of clusters detected depends in detail on the selection cuts applied, and small changes in
threshold translate to larger changes in abundance, so factor-of-two deviations from the projections
in Figure 25 would not be particularly surprising.

For the optical detection threshold we have assumed that projection effects limit useful cluster
catalogs to a minimum richness $\lambda = 15$ in the algorithm of Rykoff et al. (2011), which counts
galaxies of luminosity $L > 0.2L_*$. To account for mass-richness scatter, we choose an effective
mass threshold that yields approximately the same space density as this richness threshold. The
sharp upturn occurs when $0.2L_*$ matches the magnitude limit of the survey. In SZ, we see that
the SPT mass threshold (kindly provided by the SPT collaboration, and normalized to a total
cluster yield of ~700 clusters at full depth) is only mildly sensitive to redshift. The gentle decrease
in limiting mass with increasing $z$ reflects the fact that more distant clusters subtend smaller angles
that better match the SPT beam size, and that clusters are hotter at fixed mass with increasing
redshift. For Planck, conversely, the decreasing angular size of clusters reduces sensitivity at higher
redshift because the beam itself is large. The curve shown is a rough estimate of the Planck Early
SZ sample (Planck Collaboration, 2011b), though the final selection will go considerably lower in
mass, because of both deeper data and lower S/N cuts. The SPTpol curve is similar to SPT, but
it reaches lower masses over a smaller area, while the ACTpol curve reaches similar noise levels
to SPT ($\approx 20\,\mu$K, Niemack et al., 2010) over a larger area. (ACTpol also plans a separate survey,
deeper and narrower than SPTpol.) Turning to X-rays, the REFLEX, XXL, and eROSITA curves
all show the increase of mass threshold with redshift characteristic of flux-limited surveys. The
XXL selection is that of Valageas et al. (2011) scaled to match the observed density of C1 clusters
in the XMM-LSS field (Pacaud et al., 2007), while the eROSITA threshold represents a flux limit
$4 \times 10^{-14}$ erg s$^{-1}$ cm$^{-2}$, corresponding to 50 photon counts (Pillepich et al., 2011). The mass limit
is higher by a factor of ~3 for clusters reaching 300 photon counts.

Current wide X-ray samples are largely limited to massive systems at moderate redshifts, but
narrow/deep samples reaching $z \sim 1$ and above do exist. By comparison, the SDSS reaches lower
mass over large areas of the sky, but it only extends to $z \sim 0.5$. RCS-2 reaches $z \sim 1$, but over
a smaller (though still quite significant) area. The Planck SZ survey is largely limited to massive,
moderate redshift systems, while the SPT SZ survey has the best current sensitivity to high redshift
clusters. In the near future, DES will extend the range of optical identification to $z \sim 1$ over a
large area, but eROSITA should ultimately produce a larger sample. While DES has a lower mass
threshold over the range $0.3 < z < 1$, the larger (all-sky) area of eROSITA leads to a larger cluster
total, and eROSITA should continue to detect clusters at $z > 1$ where the DES sensitivity declines
rapidly. On a longer time scale, LSST will push the optical selection limit to $z \approx 1.5$, increasing
the number of $z > 1$ galaxy clusters by one to two orders of magnitude.

Another proposed method for detecting galaxy clusters is to search for peaks in the weak 
lensing shear field. However, while massive halos produce local shear peaks, shear peak statistics 
are known to suffer from severe projection effects: many peaks arise from the superposition of 
multiple halos along the line of sight. Consequently, shear peak selection is not a particularly 
effective method for selecting clusters of galaxies. That said, the shear peak abundance is an 
observable that can be predicted from numerical simulations in much the same way as the halo mass
function, and this approach may well yield useful cosmological constraints (e.g., Marian et al., 2009;
Dietrich and Hartlap, 2010). For the remainder of this review, however, we focus on abundances of
clusters identified by optical, X-ray, or SZ methods. We emphasize that stacked weak lensing mass
calibration of clusters identified by other methods is not equivalent to shear peak statistics, since
cluster methods use the additional information afforded by baryonic density peaks to drastically
reduce the impact of projection effects on cluster selection.

Figure 25 Selection function for several representative cluster samples, as labelled. The top panels
show surveys that are completed or currently ongoing. The bottom panels show future surveys.
Left panels show the limiting mass as a function of redshift, while right panels show the number
of galaxy clusters above the limiting mass in redshift bins of width $\Delta z = 0.1$. The yellow region in
the left panels corresponds to the area in parameter space where one expects fewer than one galaxy
cluster above the mass and redshift under consideration. For the abundance plot, we consider the
appropriate area for each of the surveys: 30,000 deg$^2$ for the eROSITA and Planck cluster samples,
10,000 deg$^2$ for the REFLEX sample, 20,000 deg$^2$ for LSST, 10,000 deg$^2$ for SDSS, 5,000 deg$^2$ for
DES, 1,000 deg$^2$ for RCS-2, 2500 deg$^2$ for SPT, 600 deg$^2$ for SPTpol, and 4000 deg$^2$ for ACTPol.
The current ACT survey (not plotted) is similar to SPT, with a somewhat higher mass threshold
and a 1000 deg$^2$ survey area. Different line types are used only to aid visual discrimination.

6.3.3. Calibrating the Observable-Mass Relation 

The biggest challenge for cluster cosmology is characterizing the observable-mass relation
$P(X|M, z)$, where $X$ is a cluster observable that is correlated with mass (e.g., richness, $Y_{\rm SZ}$,
$L_X$) and $P(X|M, z)$ is the probability that a halo of mass $M$ at redshift $z$ is detected as a cluster
with observable $X$. This relation is usually described by parameters that specify the mean
relation, the rms scatter, and perhaps a measure of skewness or kurtosis, all of which can evolve with
redshift. There are three general approaches to determining these parameters: simulations, direct
calibration, and statistical calibration.

In the simulation approach, one relies on numerical simulations to calibrate the observable-mass
relation (e.g., Vanderlinde et al., 2010; Sehgal et al., 2011). The main difficulty that simulation
methods face is our incomplete understanding of baryonic physics, particularly galaxy formation
feedback processes. These difficulties can be minimized by defining new X-ray observables that
are expected to be robust to these details, and through careful exploration of the sensitivity of
the observable-mass relation to the physics that goes into the simulations (e.g., Nagai et al., 2007;
Stanek et al., 2010; Fabjan et al., 2011). Nevertheless, we think it unlikely that simulations will
achieve the ~0.5-2% level of accuracy required for cluster abundance experiments to become
statistics dominated in the next ten years.

The second approach to calibrating the observable-mass relation is the direct method, in which 
a small subset of galaxy clusters have X-ray hydrostatic mass estimates and/or weak lensing mass 
estimates that are taken to represent "true" masses. The observable-mass relation is directly 
calibrated on this small subset of galaxy clusters, then applied to the general cluster population
(Vikhlinin et al., 2009; Mantz et al., 2010). Unfortunately, hydrostatic mass estimates are
themselves problematic because non-thermal pressure support (bulk motions, magnetic fields, cosmic
rays) is expected to bias them at the 10%-20% level (Lau et al., 2009; Meneghetti et al., 2010),
and it is not clear that these biases can be predicted at the required level of accuracy. We therefore
suspect that hydrostatic estimates will play a steadily decreasing role in future cluster abundance
experiments. Weak lensing mass estimates of individual clusters can in principle be unbiased in the
mean, but they are typically available only for the most massive galaxy clusters in a given sample
because of limited signal-to-noise ratio. In addition, even if the WL shape noise is small, halo
orientation and large scale structure introduce irreducible noise in the mass estimates of individual
clusters at the 20%-30% level (Becker and Kravtsov, 2010).

The final approach to calibrating the observable-mass relation is statistical: instead of relying 
on precise mass estimates of a subsample of galaxy clusters, the relation is calibrated using addi- 
tional observables for the full sample that correlate with mass. One such statistical method uses 
the spatial clustering of the clusters themselves, as characterized by the variance of counts-in-cells
(Lima and Hu, 2004) or by the cluster correlation function or power spectrum (Schuecker et al.,
2003; Majumdar and Mohr, 2004; Hütsi and Lahav, 2008). Because the bias of halo clustering
depends on mass (Figure [1]), the amplitude and scale-dependence of clustering provides infor- 
mation about the mass-observable relation. Operationally, one parameterizes this relation, then 
uses standard likelihood methods to jointly fit for both cosmology and the $P(X|M, z)$ parameters
(Hu and Cohn, 2006; Holder, 2006). These types of analyses are often referred to as
"self-calibration" because they do not require "direct" mass calibration data. However, we think the
descriptor "statistical mass calibration" is more accurate.

The other statistical method we consider is stacked weak lensing, wherein one measures the
mean tangential shear of background galaxies around galaxy clusters in a bin of fixed observable.
In other words, the stacked weak lensing signal is the cluster-shear correlation function, which
can be inverted to yield the mean 3-d mass profile of clusters in the bin (Johnston et al., 2007).
Because this measurement allows one to stack many clusters, one can easily obtain high
signal-to-noise measurements even for low mass clusters and large angular distances (Mandelbaum et al.,
2008; Sheldon et al., 2009). Since the underlying halo population is randomly oriented relative to
the line of sight, stacked weak lensing mass calibration does not suffer from orientation biases so
long as the cluster identification itself does not preferentially select halos oriented along a particular
direction or aligned with line-of-sight structure. However, orientation biases in the cluster selection
method will probably exist to some degree, and they must be calibrated carefully on simulations.
Finally, because this method relies on stacking all galaxy clusters, it only provides information
about the mean of the mass-richness relation, so additional data are required to provide tight
constraints on the scatter.

Figure [26] shows the error in mass calibration that can be achieved using stacked weak lensing
for both "Stage III" (left panel) and "Stage IV" (right panel) observations, calculated via the
methodology described by Rozo et al. (2011b). Briefly, we assume a source redshift distribution
appropriate for DES-like survey depth, and we sum over all annuli within the radius $2R_{200}$, which
is a rough approximation for the location of the one-to-two halo transition of the matter correlation
function using the Hayashi and White (2008) model. (Other studies, e.g. Tavio et al. 2008, also
find that the one-halo regime of the mass profile extends well beyond $R_{200}$.) For our Stage III estimates
we assume an intrinsic shape noise $\sigma_e = 0.4$ and source galaxy surface density $n_g = 10\,{\rm arcmin}^{-2}$,
while for Stage IV we assume $\sigma_e = 0.3$ and $n_g = 30\,{\rm arcmin}^{-2}$. Note that the corresponding
tangential shear error is $\approx \sigma_e/\sqrt{2}$. These values correspond roughly to expectations for DES
data and Euclid/WFIRST data, respectively; the lower $\sigma_e$ for the latter reflects higher image quality,
though the partition of this improvement between $\sigma_e$ and $n_g$ is somewhat arbitrary. LSST falls
between these two cases but closer to Stage IV. We assume that clusters have NFW mass profiles
(Navarro et al.), and we account for the decrease in background source density with increasing
cluster redshift. In all cases, the redshift distribution (equation 142) is set to

F{z) oc z^exp [— (z/z*)^] 

with $z_* = 0.5$. This is appropriate for DES and underestimates the redshift depth for LSST,
which will result in a slight overestimate of the statistical uncertainties for Stage IV experiments, 
particularly at the highest redshift bins. 

In each panel of Figure [26], dashed red curves show the error from shape noise alone, while solid
curves include the intrinsic scatter between noiseless WL mass estimates and true three-dimensional
halo masses, a consequence of non-spherical mass distributions, which we add in quadrature to the
shape noise assuming an intrinsic scatter per cluster of $\sigma_{\rm wl} = 0.3$ (Becker and Kravtsov, 2010). The
two curves separate when the number of sources is high enough to measure individual clusters with
S/N $\sim 3$. We assume the stacked weak lensing signal uses all halos within a redshift bin $z = z_c \pm 0.05$
and above a given mass threshold as labeled. The forecast mass errors are marginalized over
concentration. The improvement in precision with decreasing mass is driven by the rapid increase
in the number of halos as the mass threshold decreases. For mass thresholds $1-2 \times 10^{14}\,M_\odot$,
calibration at the 1-2% level is achievable in principle with Stage III data and at the sub-percent
level with Stage IV data. These are errors per $\Delta z = \pm 0.05$ bin, so if one assumes a smooth,
parameterized evolution of $P(X|M,z)$ it may be possible to constrain the overall normalization
more tightly. Conversely, some forms of WL systematics (e.g., uncertainty in the shear calibration
or source redshift distribution) could introduce mass calibration errors correlated across redshift




Figure 26 Mass uncertainty from stacked weak lensing calibration as a function of redshift, assuming
only WL shape noise (dashed red curves) and including sample variance due to intrinsic scatter
between WL mass and halo mass (solid curves). For Stage III data we assume $\sigma_e = 0.4$ and
$n_g = 10$ galaxies/arcmin$^2$, while for Stage IV we assume $\sigma_e = 0.3$ and $n_g = 30$ galaxies/arcmin$^2$.
For both cases we assume a $10^4\,{\rm deg}^2$ survey, and the redshift bin width is $z = z_c \pm 0.05$. Each
curve corresponds to a different mass threshold as labeled. The blue dotted line shows the mass
error corresponding to a statistics-limited cluster survey with a threshold mass of $10^{14}\,M_\odot$, as per
Figure [21]. The intersection between the blue dotted line and the lowest solid black line marks the
redshift at which a cluster abundance experiment with a threshold mass of $10^{14}\,M_\odot$ transitions
from being dominated by the statistical error in cluster abundance (at low redshift) to the error in
the weak lensing mass calibration (at high redshift).



bins. The results in Figure [26] are broadly consistent with those from the more detailed treatment
by Oguri and Takada (2011).



Comparing Figures [21] and [26], we see that Stage IV weak lensing data can in principle calibrate
the mean relation well enough that a $10^4\,{\rm deg}^2$ cluster survey would be limited by the statistical
uncertainty in abundance for $z < 0.5$, though mass calibration error would dominate at higher
redshift. (The abundance error and weak lensing calibration error both scale with area as $A^{-1/2}$.)
The statistics limit from Figure [21] is shown in Figure [26] as the blue dotted line. Stage III weak
lensing data fall short of this goal by a factor $\sim 3$, but they can still achieve powerful constraints
on $\sigma_{11,{\rm abs}}(z)$ (see Figure [28] below).

The general trends in Figure [26] can be understood using simple arguments. For a singular
isothermal sphere (SIS) of velocity dispersion $\sigma_v \propto M^{1/3}$, the tangential shear is $\gamma(\theta) = \theta_E/2\theta$,
where $\theta$ is the angular distance to the cluster center, and $\theta_E$ is the Einstein radius. The Einstein
radius is related to the velocity dispersion via (Fort and Mellier, 1994)

$$\theta_E = 4\pi \left(\frac{\sigma_v}{c}\right)^2 \frac{D_{ls}}{D_s} \approx 0.07\,{\rm arcmin} \left(\frac{\sigma_v}{550\,{\rm km\,s^{-1}}}\right)^2 \left(\frac{D_{ls}/D_s}{0.5}\right) , \tag{143}$$

where $D_s$ is the distance to the source, $D_{ls}$ is the distance to the source as seen from the lens,
and we have scaled to a typical value of their ratio. We have also scaled equation (143) to the
(1-dimensional) velocity dispersion of a $2 \times 10^{14}\,M_\odot$ cluster at $z = 0.5$. Each source galaxy gives a
low S/N estimate of $\gamma$ and hence of $\theta_E = 2\theta\gamma$. The variance of this estimate is ${\rm Var}(\theta_E) = 2\theta^2\sigma_e^2$,
where $\sigma_e = \sqrt{2}\,\sigma_\gamma$ is the shape noise. The number of source galaxies in a logarithmic angular
interval $d\ln\theta$ is $2\pi n_g \theta^2 d\ln\theta$, so each such interval contributes equally to the S/N on $\theta_E$, from $\theta_{\rm min}$
where the weak lensing approximation fails to $\theta_{\rm max}$, the angular extent of the cluster. The variance
of the estimate for an individual cluster is thus

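$${\rm Var}(\theta_E) = \frac{2\theta^2\sigma_e^2}{2\pi n_g \theta^2 \ln(\theta_{\rm max}/\theta_{\rm min})} = \frac{\sigma_e^2}{\pi n_g \ln(\theta_{\rm max}/\theta_{\rm min})} , \tag{144}$$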

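and the variance for $N$ clusters is smaller by a factor of $N$. As representative values we take
$\theta_E = 0.07$ arcmin, $\theta_{\rm min} = 5\theta_E = 0.35$ arcmin (so $\gamma_{\rm max} = 0.1$), and
$\theta_{\rm max} = 6.5$ arcmin, the angle subtended by a radius
$R = 2R_{200}$ at $z = 0.5$ (for $M = 2 \times 10^{14}\,M_\odot$), yielding $\ln(\theta_{\rm max}/\theta_{\rm min}) \approx 3$. Since $\theta_E \propto \sigma_v^2 \propto M^{2/3}$,
$\Delta\ln M = 1.5\,\Delta\ln\theta_E$, with $\Delta\ln\theta_E = \theta_E^{-1}[{\rm Var}(\theta_E)]^{1/2}$. Putting these results together yields a total
shape noise at $z = 0.5$ of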

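$$\Delta\ln M \approx 1.5\,N^{-1/2} \left[\frac{\sigma_e^2}{\pi n_g \theta_E^2 \ln(\theta_{\rm max}/\theta_{\rm min})}\right]^{1/2} \tag{145}$$

$$\approx 0.38\,N^{-1/2} \left(\frac{\sigma_e}{0.3}\right) \left(\frac{M}{2 \times 10^{14}\,M_\odot}\right)^{-2/3} \left(\frac{n_g}{30\,{\rm arcmin}^{-2}}\right)^{-1/2} \left(\frac{D_{ls}/D_s}{0.5}\right)^{-1} . \tag{146}$$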

This error estimate is 25% smaller than the value plotted in Figure [26] (which shows $\Delta\ln M \sim 0.008$
at $z = 0.5$ from shape noise alone), in part because the surface density of sources behind the clusters
is lower than $n_g$, and in part because marginalizing over the NFW concentration parameter further
increases the mass error. Including the dependence of $N$ on mass threshold, equation (145) implies

$$\Delta\ln M_{\rm shape} \propto \theta_E^{-1} N^{-1/2} \propto M^{-2/3+\alpha/2} , \tag{147}$$

where $\alpha$ is the mass function slope shown in Figure [21]. For $\alpha > 4/3$, which is always satisfied
for $M > 10^{14}\,M_\odot$, the increase in abundance at lower masses outweighs the lower S/N per cluster,
yielding higher precision at lower mass threshold as seen in Figure [26]. To obtain the total noise,
one simply adds the intrinsic weak-lensing noise $\sigma_{\rm wl} N^{-1/2}$ in quadrature to the shape noise.
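
As a numerical cross-check of these scalings, the following Python sketch evaluates the SIS Einstein radius, $\theta_E = 4\pi(\sigma_v/c)^2 D_{ls}/D_s$, and the per-cluster shape-noise error, $\Delta\ln M = 1.5\,\theta_E^{-1}[\sigma_e^2/(\pi n_g \ln(\theta_{\rm max}/\theta_{\rm min}))]^{1/2}$, for the representative Stage IV values quoted in the text. It is an order-of-magnitude illustration only, not the calculation behind Figure [26]; the function names and the choice of $N = 1000$ stacked clusters in the last line are assumptions made for the example.

```python
import math

C_KMS = 299792.458                        # speed of light in km/s
ARCMIN_PER_RAD = 180.0 * 60.0 / math.pi   # radians -> arcminutes


def einstein_radius_arcmin(sigma_v_kms, dls_over_ds):
    """SIS Einstein radius, theta_E = 4 pi (sigma_v/c)^2 D_ls/D_s, in arcmin."""
    theta_rad = 4.0 * math.pi * (sigma_v_kms / C_KMS) ** 2 * dls_over_ds
    return theta_rad * ARCMIN_PER_RAD


def dlnm_single_cluster(theta_e, sigma_e, n_g, theta_min, theta_max):
    """Shape-noise error on ln M for one cluster.

    Uses Var(theta_E) = sigma_e^2 / (pi n_g ln(theta_max/theta_min)) and
    Delta ln M = 1.5 Delta ln theta_E, since theta_E scales as M^(2/3).
    Angles are in arcmin and n_g is in arcmin^-2.
    """
    var_theta_e = sigma_e ** 2 / (math.pi * n_g * math.log(theta_max / theta_min))
    return 1.5 * math.sqrt(var_theta_e) / theta_e


def dlnm_stacked(dlnm_one, n_clusters):
    """Stacking N clusters reduces the shape-noise error by N^(-1/2)."""
    return dlnm_one / math.sqrt(n_clusters)


theta_e = einstein_radius_arcmin(550.0, 0.5)
per_cluster = dlnm_single_cluster(0.07, 0.3, 30.0, 0.35, 6.5)
print(f"theta_E = {theta_e:.3f} arcmin")
print(f"Delta ln M per cluster = {per_cluster:.2f}")
# N = 1000 is an illustrative number of stacked clusters, not a forecast value.
print(f"Delta ln M for N = 1000 = {dlnm_stacked(per_cluster, 1000):.4f}")
```

The per-cluster error of a few tenths shows why stacking is essential: only after averaging over thousands of clusters does the shape-noise term approach the percent level.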

Multi-wavelength studies of galaxy clusters also allow for statistical mass calibration from
cross-correlation studies. Just as the clustering of clusters is a mass-dependent observable, so too are
the abundance functions of different observables. Consequently, overlapping surveys allow for
the possibility of measuring the abundance of galaxy clusters as a function of two observables
$X_1$ and $X_2$. While an overall shift in the normalization of the multi-variate observable-mass
relation $P(X_1, X_2|M)$ is still degenerate with cosmology, the addition of the clustering signal,
which depends on cluster masses directly, allows one to jointly calibrate $P(X_1, X_2|M)$ while still
improving the cosmological constraints relative to those derived from a single observable (Cunha,
2009). The improvement is driven by the fact that using two cluster observables simultaneously
allows one to better constrain the scatter of the observable-mass relation (see also Stanek et al.,
2010). Given the large overlap between many of the currently ongoing or near future cluster
surveys (e.g., DES fully overlaps with SPT), we expect this type of analysis to become increasingly
important in the coming decade.

In practice, the distinction between simulation, direct, and statistical mass calibration is some- 
what artificial. One can use simulation and direct mass calibration to place priors on the observable- 
mass relations, then use statistical methods to arrive at the final constraint. High quality observa- 
tions of individual clusters can provide important information about the scatter of the observable- 
mass relation, a quantity that is only indirectly constrained via statistical calibration methods. 
Conversely, we expect that only statistical methods, and particularly stacked weak lensing, are 
likely to achieve the ~ 1% mass scale accuracy demanded by Stage IV experiments. To the extent
that this is true, optical imaging of galaxy clusters will be a necessary component of all future
cluster surveys, not just for redshifts, but also for cluster mass calibration. Conversely, imaging 
surveys conducted for WL studies of cosmic acceleration will automatically enable cluster studies. 

6.4. Systematic Uncertainties and Strategies for Amelioration

If $X$ is a cluster observable correlated with mass, and $P(X|M,z)$ the mass-observable relation
discussed in §6.3.3, then the expected number of clusters in a volume $V$ at redshift $z$ above a
threshold $X_{\rm min}$ is

$$N(X_{\rm min}, z) = \int_{X_{\rm min}}^{\infty} dX\, \frac{dN}{dX} = \int_{X_{\rm min}}^{\infty} dX \int_0^{\infty} dM\, V(z)\, \frac{dn(z)}{dM}\, P(X|M,z) , \tag{148}$$

where $dn(z)/dM$ is the halo mass function at redshift $z$. From equation (148) we can identify
several sources of potential systematic uncertainties: errors in cluster redshifts, incompleteness and
contamination that produce extended non-Gaussian tails of $P(X|M,z)$, the form and calibration of
the "core" of $P(X|M,z)$, and the theoretical prediction of $dn/dM$ itself. We discuss each of these
categories in turn.
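
To make the structure of the number-count integral concrete, the Python sketch below evaluates it for a toy model. The assumptions are ours and are chosen only so that a closed-form answer exists for comparison: a pure power-law mass function $dn/d\ln M = A M^{-\alpha}$ (masses in units of an arbitrary pivot), an unbiased observable with $\langle\ln X|M\rangle = \ln M$, and log-normal scatter $\sigma$. With these choices the integral over $X$ reduces to a complementary error function, and the remaining integral over $\ln M$ is done with a trapezoidal sum. A realistic forecast would replace the power law with a calibrated mass function.

```python
import math


def detection_probability(ln_m, ln_x_min, sigma):
    """P(X > X_min | M) for log-normal scatter with <ln X|M> = ln M."""
    return 0.5 * math.erfc((ln_x_min - ln_m) / (math.sqrt(2.0) * sigma))


def expected_counts(ln_x_min, amplitude, alpha, sigma, volume, step=1e-3):
    """Trapezoidal estimate of N(X_min) = V int dlnM (dn/dlnM) P(X > X_min | M)."""
    lo = ln_x_min - 12.0 * sigma   # below this the selection probability is negligible
    hi = ln_x_min + 15.0           # above this the toy mass function is negligible
    n_steps = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n_steps + 1):
        ln_m = lo + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        dn_dlnm = amplitude * math.exp(-alpha * ln_m)   # toy power-law mass function
        total += weight * dn_dlnm * detection_probability(ln_m, ln_x_min, sigma)
    return volume * total * step


def expected_counts_analytic(ln_x_min, amplitude, alpha, sigma, volume):
    """Closed form for the same toy model: V A X_min^-alpha exp(alpha^2 sigma^2 / 2) / alpha."""
    return volume * amplitude * math.exp(-alpha * ln_x_min + 0.5 * (alpha * sigma) ** 2) / alpha


numeric = expected_counts(0.0, 1.0, 3.0, 0.2, 1.0)
analytic = expected_counts_analytic(0.0, 1.0, 3.0, 0.2, 1.0)
print(f"numeric = {numeric:.5f}, analytic = {analytic:.5f}")
```

The factor $\exp(\alpha^2\sigma^2/2)$ in the closed form shows directly how scatter in the observable boosts the counts above a threshold when the mass function is steep.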



6.4.1. Redshift Uncertainties

Equation (148) implicitly assumes that all clusters are assigned the correct redshifts. As cluster
samples grow to the tens and even hundreds of thousands, obtaining spectroscopic redshifts for
all systems becomes impractical, and photometric redshifts are essential. Fortunately, clusters
contain many galaxies with uniform (red-sequence) colors, allowing precise and accurate
photo-z's. Lima and Hu (2007) estimated the level at which the bias and scatter of photometric redshift
errors must be controlled in a Stage III dark energy experiment so as to not degrade cosmological
information, finding that the rms scatter must be held to $\sigma_z < 0.03$ and that any bias in the mean
photo-z must be held below $\Delta z = 0.003$. Current cluster photometric redshift estimates have a
dispersion of $\approx 0.01$ (e.g. Koester et al., 2007), so controlling the scatter at the 0.03 level is not
particularly problematic. The bias on the mean is more challenging, but current catalogs do achieve
close to the necessary accuracy. For instance, the bias of the SDSS maxBCG catalog, measured by
comparing cluster photo-z's to spectroscopic redshifts, is $\sim 0.004$ (Koester et al., 2007). We expect
these successes will still hold as we push to higher redshifts, so cluster photometric redshift errors
are unlikely to be a significant source of systematic uncertainty in abundance studies, at least for
samples below $z \sim 1$. Above this redshift, the 4000 Å break feature in the spectrum of early-type
galaxies redshifts into the IR, and the photometric redshift accuracy will become more difficult
to control at the required level unless near-IR data are available. X-ray and SZ cluster samples
require deep multi-band optical imaging and/or spectroscopic follow-up to achieve these errors.
In particular, while the use of iron lines in X-ray spectroscopy has proven to be a reliable indicator
of cluster redshift (e.g. Yu et al., 2011), the accuracy achieved by these methods is only of order
$\sim 0.03$, with a not-insignificant outlier fraction, and even then this requires a significant number
of photon counts. Nevertheless, for high redshift systems without IR data this information is often
the only indicator of a cluster's redshift, and it can therefore play a critical role.



6.4.2. Contamination and Incompleteness: The Tails of $P(X|M,z)$

Equation (148) assumes a one-to-one match between halos and observable clusters. In practice,
any observed cluster catalog suffers some degree of contamination, the presence of systems whose
true halo mass is far below the value suggested by the observable $X$. Cluster catalogs are also
affected by incompleteness, halos whose corresponding observable $X$ is anomalously low so that
they are assigned masses far below their true masses, or perhaps fail to make it into the catalog
at all. Thus, we can think of contamination and incompleteness as characterizing the extended
non-Gaussian tails of $P(X|M,z)$.

Significant levels of contamination and incompleteness can be tolerated provided that they
are well calibrated. A contamination fraction $C$ increases the estimated cluster abundance by a
factor $(1 + C)$ relative to the true value, while an incompleteness fraction $I$ reduces the estimated
abundance by a factor $(1 - I)$. To prevent them from becoming the limiting factor in cluster abundance
measurements, the product $(1+C)(1-I)$ must be determined to a fractional accuracy that is smaller
than the uncertainty in the cluster space density, roughly $N^{-1/2}$ if limited by cluster statistics or
$\alpha\,\Delta\ln M$ if limited by mass calibration uncertainty.



Contamination can also impact mass calibration (Cohn et al., 2007; Erickson et al., 2011). In
the simplest case, if $\bar{M}$ is the mean mass of a sample of clusters selected by some range of observable
and contaminating clusters have mass $M \ll \bar{M}$, they dilute the sample and reduce the mean mass
inferred from calibration by a factor $(1 + C)$. Incompleteness, on the other hand, should not affect
the estimated mean mass of a galaxy cluster sample, provided that the reason a cluster of given
$X$ fails to be detected is not correlated with its halo mass. Keeping the impact of contamination
uncertainty sub-dominant requires that the contamination level be known to $\Delta C \approx \Delta\ln(1 + C) <
\Delta\ln M$. This is a stiffer requirement than that on the product $(1 - I)(1 + C)$, by a factor of $\alpha \sim 3$,
so it will be more difficult to achieve in practice.
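
The bookkeeping in the preceding two paragraphs can be written out in a few lines of Python. This is a minimal sketch under the simplest-case assumptions stated in the text (contaminants of negligible mass, incompleteness uncorrelated with halo mass); the function names and the example numbers are illustrative and are not taken from any survey.

```python
import math


def abundance_bias_factor(contamination, incompleteness):
    """Ratio of estimated to true abundance, (1 + C)(1 - I)."""
    return (1.0 + contamination) * (1.0 - incompleteness)


def abundance_tolerance(n_clusters, alpha, dlnm):
    """Fractional uncertainty in the cluster space density.

    Roughly N^(-1/2) if the survey is statistics limited, or alpha * Delta ln M
    if it is limited by mass calibration; the larger term is the limiting one.
    """
    return max(1.0 / math.sqrt(n_clusters), alpha * dlnm)


def diluted_mean_mass(mean_mass, contamination):
    """Mean calibrated mass when contaminants of negligible mass dilute the sample."""
    return mean_mass / (1.0 + contamination)


# Illustrative inputs: 10% contamination, 5% incompleteness, 10^4 clusters,
# alpha = 3, and a 1% mass calibration error.
factor = abundance_bias_factor(0.10, 0.05)
tolerance = abundance_tolerance(10_000, 3.0, 0.01)
print(f"(1+C)(1-I) = {factor:.3f}; must be known to better than {tolerance:.3f}")
print(f"mean mass is diluted to {diluted_mean_mass(1.0, 0.10):.3f} of its true value")
```

In this example the product $(1+C)(1-I)$ must be known to 3%, whereas the contamination fraction alone must be known to better than $\Delta\ln M = 1\%$ to protect the mass calibration, which illustrates the factor of $\alpha$ between the two requirements.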

Different cluster finding techniques are sensitive to different sources of contamination and
incompleteness. In X-rays, the principal contaminants are X-ray point sources (AGNs), which can be
effectively removed from cluster catalogs by demanding that galaxy clusters be detected as spatially
extended emission. With this cut, the fraction of galaxy clusters where AGNs have a significant
impact on the cluster emission is $< 5\%$ (Burenin et al., 2007; Mantz et al., 2010). The few percent
contamination level of today's X-ray cluster surveys is not an important systematic relative to
mass calibration uncertainty. However, the demands will be stiffer for eROSITA, so whether AGN
contamination will continue to be a negligible systematic in the future remains to be seen.
Incompleteness (in the sense of clusters that reside in non-Gaussian tails) is a source of possible concern,
since eROSITA will probe significantly lower cluster masses than current X-ray surveys, and the
regularity of the intracluster medium could break down at lower halo masses because of greater
importance of radiative cooling or galaxy and AGN feedback. However, Chandra studies of
group-scale systems show that the scaling relations of galaxy clusters extend down to $M \sim 4 \times 10^{13}\,M_\odot$
(Sun et al., 2009), so eROSITA should be able to use the vast majority of all X-ray selected groups
and clusters for cosmological investigations. As usual, the largest open question is the accuracy of the
mass calibration.

Because SZ cluster searches work in the low S/N limit, with typical detections being $\sim 5\sigma$, SZ cluster
samples typically contain a few false detections: sources that do not correspond to massive
galaxy clusters but rather reflect the stochastic nature of the CMB and/or instrumental noise.
However, both of these sources of stochasticity can be very well characterized, so we do not expect
them to be a limiting systematic: their impact on $P(X|M,z)$ is calculable. Radio emission by point
sources and/or dusty star forming galaxies can systematically reduce the SZ signal of clusters, but
these effects are expected to fall below the 10% level (e.g., Vanderlinde et al., 2010). Further
study of the ongoing SZ surveys will better illuminate the impact that such sources can have on
cosmological constraints from SZ cluster samples. Contamination by intrinsic CMB fluctuations
and contamination by point sources are both mitigated by multi-frequency observations, since the
SZ effect has a distinct spectral signature. While contamination and incompleteness of SZ samples
remain an area of active research, we think these effects are unlikely to compete with mass
calibration as a limiting uncertainty.

For optical cluster searches the primary source of contamination is projection effects: two
or more small halos lining up to produce the apparent galaxy overdensity of a larger halo. These
projections can arise from truly random superpositions or from galaxies or groups that lie in the
same filamentary structure but not within the virial radius of a common halo. Even with galaxy
spectroscopic redshifts, projection effects in the optical can produce contamination levels of 5%-20%
depending on the richness threshold (Cohn et al., 2007; Rozo et al., 2011a). The principal reason
that projection effects are more important in optical catalogs than in X-ray or SZ catalogs is that
optical catalogs tend to reach significantly lower mass thresholds at high redshift, which results
in higher surface densities of clusters and therefore stronger projection effects. In fact, projection
effects may well set the lower mass threshold at which cosmological analyses with optical clusters
are possible. We anticipate that incompleteness and contamination can be adequately modeled
through the use of realistic mock catalogs constructed using numerical simulations, provided they
are constructed to match the clustering data of the survey under consideration. These mock catalogs
can be analyzed using the same algorithms applied to the observational data, allowing one to
quantitatively characterize the impact of projection effects. Many of the most recent optical analyses
draw on such detailed mock catalogs, but greater accuracy will be needed for next generation
surveys.

The impact of contamination on weak lensing mass calibration is somewhat subtle, and probably 
weaker than the naive expectation of depressing the estimated mass by (1 + C) through dilution. 
When superposed galaxy groups masquerade as a single more massive cluster, their projected mass 
distributions are also superposed, and the lensing signal from this blend may be close to the signal 
that would come from a cluster of the combined richness. The net impact must again be evaluated 
with detailed mock catalogs. 

6.4.3. Calibrating the Core of $P(X|M,z)$

In addition to characterizing extended tails of the mass-observable relation, one must calibrate
the "core" of $P(X|M,z)$, where scatter arises from physical variations in cluster properties at fixed
halo mass, from observational noise, and from low level contamination that produces small random
fluctuations in the observable. These effects are typically assumed to produce a log-normal form
of $P(X|M,z)$, i.e., Gaussian scatter in $\ln X$ at fixed $M$. The calibration task is then to determine
the mean relation $\langle\ln X|M,z\rangle$ and the variance ${\rm Var}(\ln X|M,z)$, and to characterize any deviations
from log-normal form that are large enough to affect the predicted abundance. As the notation
indicates, the relation can evolve with redshift, and the scatter and non-Gaussianity may depend
on halo mass at fixed redshift.

We consider each of the relevant terms in turn, starting with the mean observable-mass relation.
We have already expressed our view that statistical calibration methods, and stacked weak lensing
in particular, are the most promising route to meeting the stringent demands of next-generation
cluster surveys. Cunha et al. (2009) and Oguri and Hamana (2011) show that this approach allows
the mass and redshift dependence of $\langle\ln X|M,z\rangle$ and ${\rm Var}(\ln X|M,z)$ to be parameterized in an
extremely flexible way while retaining enough information to yield strong cosmological constraints.

If the mean mass-observable relation is calibrated using stacked weak lensing, then the
systematic effects discussed for WL in §5.7 are also sources of uncertainty for cluster studies. In
particular, errors in the source galaxy redshift distribution and/or shear calibration will shift the
inferred cluster mass scale. For these systematics to be insignificant, the rule of thumb is that
the uncertainty in the mean inverse critical surface density $\langle\Sigma_{\rm crit}^{-1}\rangle$ of the source galaxies and the
error in the shear calibration must be smaller than the mass errors plotted in Figure [26] divided
by 1.5. The 1.5 factor comes in because an error in $\langle\Sigma_{\rm crit}^{-1}\rangle$ or shear calibration uniformly biases
the recovered cluster density profile and therefore biases the estimate of $R_{200}$. A bias $b$ in the mass
at a fixed aperture becomes roughly a bias $b^{1.5}$ in the estimated virial mass. Typically, a
systematic error $\Delta z$ in the mean redshift of sources produces a corresponding error $\sim \Delta z/2$ in $\langle\Sigma_{\rm crit}^{-1}\rangle$.
Recent work suggests that controlling photometric redshifts at the level required for weak lensing
mass calibration of galaxy clusters is possible (Sheldon et al., 2011). Importantly, because cluster
weak lensing depends on the mean tangential shear around cluster centers, some forms of cosmic
shear systematics are automatically averaged away and therefore not relevant for weak lensing mass
calibration of galaxy clusters. For instance, errors that are coherent on scales larger than cluster
diameters (typically a few arcmin) but incoherent on still larger scales will be averaged out in a
stacked lensing measurement. Moreover, because the weak lensing signal about galaxy clusters
is stronger than cosmic shear, uncertainties that appear for very low shear values (e.g., additive
biases) are less important. All in all, the demands on weak lensing systematics for stacked weak
lensing calibration of galaxy clusters are likely to be lower than those for cosmic shear.

There are some systematics specific to stacked cluster lensing, the most significant of which
is cluster mis-centering. If the observationally determined center of a cluster does not match the
location of the center of the dark matter halo that one would select in simulations, then the observed
mean tangential shear about the assigned center will differ from the theoretical expectation. Cluster
mis-centering should not be problematic in X-ray experiments with high angular resolution, as gas
in hydrostatic equilibrium traces the underlying gravitational potential. While a few exceptions
will arise, such as the famed Bullet Cluster (Clowe et al., 2006), the frequency of these systems
is low. For similar reasons, centroiding of SZ systems is expected to be fairly robust. The
mis-centering problem is most difficult in the optical, where the center is typically chosen to be a
specific galaxy but the choice of galaxy is not necessarily obvious. Mis-centering is currently one
of the dominant systematics in stacked cluster lensing, introducing uncertainties at the $\sim 5\%-10\%$
level (Johnston et al., 2007). There are ongoing efforts aimed at improving cluster centering
(George et al. and Rykoff et al., in preparation). Oguri and Takada (2011) find that marginalizing
over parameters that describe mis-centering does not significantly dilute the cosmological power
of cluster abundance studies, so it may be that future analyses will simply treat mis-centering via
an additional set of nuisance parameters. Alternative weak lensing estimators can be constructed
to avoid mis-centering biases in the inner regions of clusters (Mandelbaum et al., 2010). Other
potential biases that affect stacked cluster lensing are modulation of the source population by lensing
magnification, non-linear shear corrections, and source density modulation due to obscuration by
cluster members (see Rozo et al. 2011b; Hartlap et al. 2011). These effects can also have an impact
on cosmic shear experiments.

Turning to scatter, we can show that the magnitude of the variance ${\rm Var}(\ln X|M,z)$ is degenerate
with the mass scale through a simple argument. Suppose the observable of interest is a mass
estimator $X = M_{\rm obs}$, where the subscript indicates the observationally estimated cluster mass.
The observed abundance is

$$\frac{dn}{d\ln M_{\rm obs}} = \int d\ln M\, \frac{dn}{d\ln M}\, P(M_{\rm obs}|M,z) . \tag{149}$$

For a power-law mass function $dn/d\ln M = A M^{-\alpha} = A\exp(-\alpha\ln M)$ and log-normal scatter of
variance $\sigma^2 = \langle(\ln M_{\rm obs} - \ln M)^2\rangle$, one can readily compute the observed abundance by completing
the square, finding

$$\frac{dn}{d\ln M_{\rm obs}} = A\exp\left[-\alpha\ln M_{\rm obs} + \frac{\alpha^2\sigma^2}{2}\right] . \tag{150}$$

From equation (150) it is evident that a shift in mass $\Delta\ln M$ is degenerate with a shift in the
variance $\Delta\sigma^2 = 2\alpha^{-1}\Delta\ln M$. (For a more rigorous argument that arrives at the same conclusion,
see Lima and Hu 2005.) Thus, if the mass scale is controlled with an accuracy $\Delta\ln M$, then the
scatter must be controlled with an accuracy $\Delta\sigma^2 = 2\alpha^{-1}\Delta\ln M$. If we further set $\Delta\sigma^2 = 2\sigma\Delta\sigma$,
we arrive at $\Delta\sigma = \sigma^{-1}\alpha^{-1}\Delta\ln M$. The fractional accuracy with which $\sigma$ must be known to avoid
competing with $\Delta\ln M$ scales as $\sigma^{-2}$, so the requirement is much less demanding if the scatter is
smaller to begin with. As an illustrative example, we set $\alpha = 3$ and $\sigma = 0.2$, which is roughly
appropriate for SZ and likely slightly optimistic for optical. We find that the uncertainty due to
errors in the scatter becomes comparable to that from errors in the mass when $\Delta\sigma \sim 1.7\,\Delta\ln M$.
For Stage III experiments with weak lensing calibration, $\Delta\ln M \sim 2\%$, so the scatter needs to be
known at the $\Delta\sigma \sim 0.04$ level, a value in agreement with the more rigorous estimate by Rozo et al.
(2011b) and likely to be achievable in the near future (see, e.g., Rykoff et al., 2011). If Stage IV
experiments reach 0.5% precision, the corresponding uncertainty in the scatter must be below 0.01
(absolute, not fractional), which is difficult to achieve from an ab initio calculation but may be
possible with statistical calibration methods.
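
The completing-the-square result and the mass-scatter degeneracy are easy to confirm numerically. The Python sketch below convolves a power-law mass function $dn/d\ln M = A M^{-\alpha}$ with Gaussian scatter in $\ln M_{\rm obs}$ and compares the result with the closed form $A\exp(-\alpha\ln M_{\rm obs} + \alpha^2\sigma^2/2)$; it then evaluates $\Delta\sigma = \Delta\ln M/(\alpha\sigma)$. The integration scheme, function names, and default grid parameters are our own choices for illustration.

```python
import math


def observed_abundance_numeric(ln_m_obs, amplitude, alpha, sigma,
                               n_steps=4000, half_width=10.0):
    """Trapezoidal integral of (dn/dlnM) P(ln M_obs | ln M) over ln M."""
    lo = ln_m_obs - half_width * sigma
    step = 2.0 * half_width * sigma / n_steps
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = 0.0
    for i in range(n_steps + 1):
        ln_m = lo + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        gauss = norm * math.exp(-0.5 * ((ln_m_obs - ln_m) / sigma) ** 2)
        total += weight * amplitude * math.exp(-alpha * ln_m) * gauss
    return total * step


def observed_abundance_analytic(ln_m_obs, amplitude, alpha, sigma):
    """Closed form: A exp(-alpha ln M_obs + alpha^2 sigma^2 / 2)."""
    return amplitude * math.exp(-alpha * ln_m_obs + 0.5 * (alpha * sigma) ** 2)


def required_scatter_accuracy(dlnm, alpha, sigma):
    """Delta sigma = Delta ln M / (alpha sigma), from Delta sigma^2 = 2 Delta ln M / alpha."""
    return dlnm / (alpha * sigma)


alpha, sigma, dlnm = 3.0, 0.2, 0.02
numeric = observed_abundance_numeric(0.0, 1.0, alpha, sigma)
analytic = observed_abundance_analytic(0.0, 1.0, alpha, sigma)
print(f"numeric = {numeric:.6f}, analytic = {analytic:.6f}")

# Degeneracy: raising the mass scale by dlnm while raising the variance by
# 2 dlnm / alpha leaves the observed abundance unchanged.
sigma_shifted = math.sqrt(sigma ** 2 + 2.0 * dlnm / alpha)
shifted = observed_abundance_analytic(dlnm, 1.0, alpha, sigma_shifted)
print(f"shifted model / reference model = {shifted / analytic:.6f}")
print(f"required Delta sigma = {required_scatter_accuracy(dlnm, alpha, sigma):.4f}")
```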

Finally, we must consider the possibility that, in addition to extended tails reflecting
contamination and incompleteness, the core of $P(X|M,z)$ deviates from log-normal form. This problem
was considered by Shaw et al. (2008), whose discussion we paraphrase here. An observable-mass
relation can be approximated by

$$P(x) \approx G(x) - \frac{\gamma}{6}\frac{d^3G}{dx^3} + \frac{\kappa}{24}\frac{d^4G}{dx^4} , \tag{151}$$

known as the Edgeworth expansion. Here $G$ is a Gaussian of zero mean and unit standard deviation,
$x = (\ln X - \langle\ln X\rangle)/[{\rm Var}(\ln X|M,z)]^{1/2}$, $\gamma$ is the skewness of the distribution, and $\kappa$ is the kurtosis.
For a power-law mass function $dn/d\ln M \propto M^{-\alpha}$, it is straightforward to check that the resulting
cluster abundance is

$$\frac{dn}{dX} = \left(\frac{dn}{dX}\right)_0 \left[1 + \frac{\gamma}{6}\alpha^3\sigma^3 + \frac{\kappa}{24}\alpha^4\sigma^4\right] , \tag{152}$$

where $(dn/dX)_0$ is the abundance for a purely log-normal distribution and $\sigma = [{\rm Var}(\ln X|M,z)]^{1/2}$.
(Note that this $\alpha$ is also the logarithmic slope of the cumulative halo mass function
$d\ln N/d\ln M$ that appears in our earlier discussion.)

Setting $\alpha = 3$ and assuming 10% scatter for X-ray masses, a 3% correction to the abundance,
equivalent to a 1% correction in the mass, requires extreme non-Gaussianity with $\gamma \sim 7$ or $\kappa \sim 90$.
However, numerical simulations predict distributions of X-ray observables that are close to
log-normal (see, e.g., ^^^^^Eir3H, Fig. 8; Fabjan et al. 2011, Fig. 3). We therefore do not expect
X-ray studies to be sensitive to departures from a log-normal $P(X|M,z)$. For $[{\rm Var}(\ln X|M,z)]^{1/2} =
0.2$, typical for SZ and probably achievable for optical, a 3% abundance change arises from $\gamma \sim 0.8$
or $\kappa \sim 6$, still quite large deviations from Gaussianity. For $[{\rm Var}(\ln X|M,z)]^{1/2} = 0.4$ these numbers
drop to 0.1 and 0.35, respectively, so with this level of scatter a moderate degree of non-Gaussianity
can have noticeable impact on the predicted abundances. For example, a Poisson distribution for
a cluster with $\langle N\rangle = 10$ galaxies corresponds to a skewness $\gamma \sim 0.3$. This discussion demonstrates
the value of finding improved optical richness estimators that have lower scatter relative to mass
(Rozo et al., 2009; Rykoff et al., 2011).
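
The quoted thresholds can be reproduced with a short calculation. The Python sketch below assumes the leading-order Edgeworth corrections to the abundance for a power-law mass function, namely a fractional change of $\gamma\,\alpha^3\sigma^3/6$ from skewness and $\kappa\,\alpha^4\sigma^4/24$ from kurtosis, and solves for the $\gamma$ or $\kappa$ that produces a 3% change. The function names are illustrative.

```python
import math


def skewness_for_abundance_shift(dln_n, alpha, sigma):
    """Skewness gamma that changes the abundance by dln_n: gamma = 6 dlnN / (alpha sigma)^3."""
    return 6.0 * dln_n / (alpha * sigma) ** 3


def kurtosis_for_abundance_shift(dln_n, alpha, sigma):
    """Kurtosis kappa that changes the abundance by dln_n: kappa = 24 dlnN / (alpha sigma)^4."""
    return 24.0 * dln_n / (alpha * sigma) ** 4


def poisson_skewness(mean_count):
    """Skewness of a Poisson distribution with the given mean, 1/sqrt(mean)."""
    return 1.0 / math.sqrt(mean_count)


alpha = 3.0
for sigma in (0.1, 0.2, 0.4):
    gamma = skewness_for_abundance_shift(0.03, alpha, sigma)
    kappa = kurtosis_for_abundance_shift(0.03, alpha, sigma)
    print(f"sigma = {sigma:.1f}: gamma = {gamma:.2f}, kappa = {kappa:.2f}")
print(f"Poisson skewness for <N> = 10: {poisson_skewness(10):.2f}")
```

The steep $\sigma^{-3}$ and $\sigma^{-4}$ dependences are what make low-scatter observables so much more robust to non-Gaussianity.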



Figure [27] shows the impact that various elements of $P(X|M,z)$ can have on the recovered
cluster counts. For illustrative purposes, we assume that $X$ is an observed mass and show the
change in the observed mass function due to changes in $P(M_{\rm obs}|M,z)$. For our reference model,
we assume $M_{\rm obs}$ is unbiased and has log-normal scatter $\sigma = 0.2$, and we compute the cumulative
cluster counts above $M_{\rm obs}$ for our fiducial cosmology at $z = 0.6$ in a redshift bin of width $\Delta z = 0.1$.
Results at other redshifts are qualitatively similar.

Figure 27 Relative change in cluster abundances at $z = 0.6$ as a function of mass due to a 2%
bias in the mass ($\Delta\ln M = 0.02$), raising the log-normal scatter $\sigma$ from 0.2 to 0.25, or introducing
skewness $\gamma = 1$ in $P(X|M,z)$ (solid, dashed, and dot-dashed curves, respectively). The statistical
error in number counts for $A = 10^4\,{\rm deg}^2$ is shown by the dotted line. The sensitivity of $\Delta\ln N$
to systematic errors in the mass, scatter, or skewness can be estimated using the rule-of-thumb
approximations in equations (153)-(155).

Solid, dashed, and dot-dashed curves show the change in the cumulative number counts Δ ln N
if M_obs is biased by 2% (Δ ln M = 0.02), if the scatter is increased from σ = 0.2 to σ = 0.25, or
if the skewness is increased from γ = 0 to γ = 1 using the Edgeworth expansion. For reference,
we also show the statistical error on the cluster counts for A = 10^4 deg^2 as a dotted line. The
details of P(X|M, z) affect the recovered cluster counts, and the impact is larger at higher masses
than at lower masses. Moreover, the relative impact of skewness to scatter and of scatter to bias is
mass dependent, with lower masses being more robust to uncertainties in the scatter and skewness.
This is as expected: the shallower the slope of the mass function, the less important the details
of P(X|M, z). The systematic offsets in Figure 27 are well approximated (to 10% and 30%
for scatter and skewness respectively) by the rule-of-thumb calculations we have described above,
specifically

\Delta \ln N_{\rm predicted} = \alpha \, \Delta \ln M ,   (153)

\Delta \ln N_{\rm predicted} = \alpha^2 \sigma \, \Delta\sigma ,   (154)

\Delta \ln N_{\rm predicted} = \frac{1}{6} \alpha^3 \sigma^3 \, \Delta\gamma .   (155)

Given the values of Δ ln N and α expected for a survey (Fig. [Ml; typical values Δ ln N ≈ N^{-1/2} and
α ~ 3), one can use equations (153)–(155) to infer the uncertainties Δ ln M, Δσ, and Δγ required
to keep a cosmological analysis limited by abundance statistics.
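
The rule-of-thumb relations (153)–(155) are simple enough to evaluate directly. The following
Python sketch does so for the three perturbations plotted in Figure 27 and then inverts the relations
to give the tolerances needed to stay below a given statistical error. It assumes a locally power-law
mass function with slope α, treats each perturbation as small, and takes the skewness prefactor to
be 1/6 (the leading Edgeworth coefficient). The function names, the choice α = 3, and the example
catalog of 10^4 clusters are illustrative assumptions, not values taken from the forecasts.

```python
# Rule-of-thumb sensitivity of cluster counts to mass-observable systematics,
# following equations (153)-(155). All function names are illustrative.

def dlnN_from_bias(alpha, dlnM):
    """Eq. (153): shift in ln N caused by a mass bias dlnM."""
    return alpha * dlnM


def dlnN_from_scatter(alpha, sigma, dsigma):
    """Eq. (154): shift in ln N caused by changing the log-normal scatter by dsigma."""
    return alpha**2 * sigma * dsigma


def dlnN_from_skewness(alpha, sigma, dgamma):
    """Eq. (155): shift in ln N caused by changing the skewness by dgamma.

    The 1/6 prefactor is assumed here (leading Edgeworth term).
    """
    return alpha**3 * sigma**3 * dgamma / 6.0


def tolerances(alpha, sigma, dlnN_stat):
    """Invert (153)-(155): largest systematic errors that keep the induced
    shift in ln N below the statistical error dlnN_stat."""
    return {
        "dlnM": dlnN_stat / alpha,
        "dsigma": dlnN_stat / (alpha**2 * sigma),
        "dgamma": 6.0 * dlnN_stat / (alpha**3 * sigma**3),
    }


if __name__ == "__main__":
    alpha, sigma = 3.0, 0.2  # assumed slope; reference scatter from the text
    print("bias     :", dlnN_from_bias(alpha, 0.02))
    print("scatter  :", dlnN_from_scatter(alpha, sigma, 0.05))
    print("skewness :", dlnN_from_skewness(alpha, sigma, 1.0))
    # Poisson-limited example: N = 1e4 clusters gives dlnN_stat = N**-0.5 = 0.01
    print("tolerances:", tolerances(alpha, sigma, 1e4 ** -0.5))
```

For α = 3 and σ = 0.2 the three perturbations of Figure 27 correspond to shifts of about 6%, 9%,
and 3.6% in the counts, which illustrates why the mass bias must be controlled most tightly.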

6.4.4. Theoretical Systematics

Predicting observed cluster counts via equation (148) requires knowledge of the halo mass
function dn/dM for any cosmological model under consideration. If the fractional uncertainty
in dn/dM exceeds the observational error in cluster counts Δ ln N, or if the equivalent mass scale
uncertainty exceeds the mass calibration error Δ ln M, then cosmological constraints will be limited
by theoretical uncertainty rather than by observational errors. The study of Tinker et al. (2008)
finds agreement in dn/dM at the < 5% level among multiple simulations by different groups for a
ΛCDM cosmological model with WMAP3 parameters. This is roughly the level required for large
area surveys of M > 4 × 10^14 M_⊙ clusters in Δz = 0.1 bins, though higher accuracy is needed
for lower mass thresholds (for detailed discussion see Cunha and Evrard, 2010).
The formula (39) describes Tinker et al.'s z = 0 results accurately, but at redshifts z = 0.5–2.5
they find deviations of ~ 10–30% from this "universal" prescription. While these deviations are
themselves numerically calibrated, their existence suggests that the mass function may depend on
the dark energy model even when expressed in terms of the σ(M) relation as in equation (39). In
addition, consistency in halo definitions is clearly critical. For instance, Bhattacharya et al. (2011)
find that mass functions in their suite of wCDM simulations, which are calculated using
friends-of-friends halo finders, deviate by up to 10% from a fitting formula calibrated on their ΛCDM
simulation suite. It seems likely that Stage III and certainly Stage IV experiments will need to
move to emulator-based methods with comprehensive N-body libraries (e.g., Lawrence et al., 2010)
rather than simple fitting formulae.

While further N-body work is needed to interpret future surveys, dark matter evolution is 
straightforward in principle, and the problem should yield to sufficient applications of computational 
force. Baryonic evolution is potentially a thornier issue. Some X-ray studies suggest a depletion of 
baryonic mass (stars + hot gas) relative to the universal Ω_b/Ω_m ratio by 20–30% within the
Δ = 500ρ_c radius, with systematically larger depletion in less massive clusters (e.g., Giodini et al., 2009).
For Ω_b/Ω_m = 0.17, a 20% deviation in baryonic mass is a 3.4% deviation in total mass, and thus
comparable to or larger than the statistical mass calibration errors achievable with stacked weak
lensing (Figure 26), as well as the precision required to achieve the statistical limits of large cluster
surveys (Figure 21). Hydrodynamic simulations can explain baryon depletions comparable to those
observed (Young et al., 2010), but the magnitude and even the sign of the baryonic effects depend
on the star formation and feedback physics (e.g., Stanek et al., 2009; Cui et al., 2011). Furthermore,
because the baryons influence the dark matter profile, they can have substantial impact (~ 15%)
on the total mass within a high overdensity threshold (e.g., the Δ = 500ρ_crit threshold frequently
adopted in X-ray analyses; see Stanek et al., 2009). In all of these simulations the corrections are
smaller at larger radii, so defining halo boundaries at lower overdensity (such as the Δ = 200ρ
convention used here) is beneficial in this respect.
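
The conversion from a baryonic to a total mass deviation quoted above is a one-line estimate. It
is written out here under the simplifying assumption that the dark matter mass is held fixed and
only the baryonic component changes:

```latex
\frac{\Delta M_{\rm tot}}{M_{\rm tot}}
  \simeq \frac{\Omega_b}{\Omega_m}\,\frac{\Delta M_b}{M_b}
  = 0.17 \times 0.20 = 0.034 .
```

The same estimate for a 30% depletion gives 0.17 × 0.30 ≈ 0.05, so the quoted 20–30% range of
baryon depletion maps to roughly 3–5% in total mass before any back-reaction on the dark matter
profile is considered.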

It may be possible to calibrate baryonic effects well enough with simulations and detailed 
observations of selected systems to remove them as a source of systematic uncertainty, but this 
problem will require concerted effort, particularly when Stage IV experiments get underway. By 
the same token, if stacked weak lensing is the primary mass calibration tool, then one must also 
develop robust theoretical models for predicting the weak gravitational lensing signal, which in 
turn requires that the halo-mass correlation function be characterized at the same level as Δ ln M.
Current analytical models are accurate only at the ~ 10%–20% level (Hayashi and White, 2008),
so this is another area that requires further theoretical study.

A final caveat related to the halo mass function is that primordial non-Gaussianity could alter
its form (e.g., Weinberg and Cole, 1992; Dalal et al., 2008; Grossi et al., 2009; LoVerde and Smith,
2011; D'Amico et al., 2011) and thereby change the cluster abundances predicted for a given dark
energy model (e.g., Cunha et al., 2010; Pillepich et al., 2011). Of course, evidence for non-Gaussian
initial conditions would be exciting in its own right, with important implications for early-universe
physics. However, it appears that the levels of non-Gaussianity that would have significant impact
on cluster abundances are already ruled out by other constraints, unless one allows the magnitude of
the non-Gaussianity to be scale-dependent (e.g., Hoyle et al., 2011; Paranjape et al., 2011). Given
the strong theoretical prior for Gaussian initial conditions and the multiple observational probes
that could detect and characterize primordial non-Gaussianity if it exists, we think it unlikely that 
non-Gaussianity will limit the power of cluster abundances as a probe of dark energy and modified 
gravity. 

6.5. Space vs. Ground 

As discussed in §6.2, X-ray observations, possible only from space, have played a central role in
nearly all cluster cosmological studies to date. The ROSAT All-Sky Survey has been the basis for
many of the cluster samples used in these studies (Table H]) . Pointed observations with a variety 
of telescopes, especially XMM-Newton and Chandra, have been the basis of mass calibration for 
X-ray observables and the source of most empirical knowledge about the physics of the intracluster 
gas. Ongoing XMM-Newton surveys will expand the dynamic range and size of X-ray catalogs over 
the next few years. The most important advance will come with the eROSITA mission, which 
should produce the definitive all-sky survey of massive (M > 4 × 10^14 M_⊙) clusters out to z ~ 1,
with an extended tail of higher redshift clusters reaching z ~ 2. Follow-up X-ray studies at higher
angular resolution will help better assess point-source contamination and will improve the mass
calibration of the eROSITA catalog. For comparable numbers of clusters, X-ray catalogs offer
significant advantages over SZ or optical catalogs because of the low scatter expected between
X-ray observables and halo mass, which reduces sensitivity to uncertainties in the width and form of
the observable-mass relation (§6.4.3).

For SZ searches, ground-based telescopes have higher sensitivity than space observatories be- 
cause of their larger collecting area and higher angular resolution. The larger beam size of the 
Planck observatory (~ 5 arcmin) relative to SPT and ACT (~ 1 arcmin) reduces its ability to 
detect high redshift systems. Nonetheless, the all-sky nature of Planck observations is an impor- 
tant asset, and the Planck catalog of high mass clusters will be useful both for direct cosmological 
constraints and for cross-correlation studies with clusters identified at other wavelengths. Thus, 
we consider the Planck, SPT, and ACT surveys as highly complementary. Any future CMB space 
mission designed to probe inflation physics and primordial gravity waves would also produce a 
much more sensitive all-sky SZ cluster catalog, provided it achieved high angular resolution. 

Turning to optical searches, space observatories provide little advantage for cluster detection at 
z < 1, since cluster detection does not gain much from the improved image resolution achievable 
from space. However, as discussed in §6.3.2, space-based near-IR imaging is highly desirable for
extending (rest-frame) optical cluster catalogs to z ~ 2. In the near future, such searches will rely
on Spitzer data, as in the case of ISCS (Eisenhardt et al., 2008), SpARCS (Wilson et al., 2006),
and the recently approved 100 deg^2 Spitzer-SPT Deep Field. Additional IR data is or will soon be
available from surveys like VHS, UKIDSS, and WISE, though it is unclear whether these surveys are
deep enough to allow for high redshift cluster finding. The VIKING survey, covering ~ 1500 deg^2,
should be sufficiently deep to allow for cluster detection at z > 1. In the longer term, IR imaging
from Euclid and/or WFIRST could make a key contribution to high redshift cluster surveys. High
redshift cluster detection should also be feasible with extremely deep optical imaging from the
ground, like that planned for LSST, which should reach z ~ 1.5.

In the long run, however, the most important contribution of space observations to cluster
cosmology will come via weak lensing mass calibration rather than cluster finding. The statistical
error of WL mass calibration scales as n_g^{-1/2}, where n_g is the source surface density. As can be
seen from Figure 26, a surface density n_g ≳ 30 arcmin^-2 is required to reduce mass calibration
error below the statistical abundance error, and even then only for z < 0.5. This source density is
expected for an optical space mission like Euclid, but it is probably higher than can be achieved by
ground-based observations, even with the depth and image quality of LSST. The cluster counting
error and mass calibration error both scale with survey area as A^{-1/2}, so the area effect cancels
out if the cluster and WL surveys overlap completely. If the cluster survey covers a larger area
(e.g., the all-sky eROSITA catalog), then the WL source density required to saturate the halo
statistics limit is even higher. Reaching the calibration accuracy allowed by the source galaxy
statistics also requires excellent control of shape measurement systematics, generally expected to
be lower from a space-based platform, and photo-z systematics, which probably require space-based
IR imaging to achieve the stringent demands implied by Figure 26. More generally, if the error in
WL mass calibration sets the ultimate limit of cluster measurements of fluctuation growth, as we
have speculated it will, then the achievable error on σ_11,abs(z) scales as n_g^{-1/2} or as (Δγ)_sys if the
WL measurements are themselves limited by a shear measurement systematic (Δγ)_sys.
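
The scalings in this paragraph can be turned into a small calculator. The Python sketch below
assumes pure shape-noise scaling, so that the calibration error goes as (n_g × overlap area)^(-1/2),
and it assumes that the WL survey lies entirely inside the cluster survey. The 30 arcmin^-2
reference density is the threshold quoted above; the 1% reference error and the survey areas in
the example are hypothetical round numbers chosen only for illustration.

```python
import math


def wl_calibration_error(err_ref, n_g, n_ref=30.0, area_ratio=1.0):
    """Rescale a reference WL mass-calibration error to a new source density.

    err_ref    : fractional calibration error at source density n_ref
    n_g, n_ref : source surface densities in arcmin^-2
    area_ratio : (new WL overlap area) / (reference overlap area)
    Assumes pure shape-noise scaling, error ~ (n_g * area)^(-1/2).
    """
    return err_ref * math.sqrt(n_ref / (n_g * area_ratio))


def required_source_density(n_ref, cluster_area, wl_area):
    """Source density needed to keep calibration at the counting-error level
    when the cluster survey is larger than the WL survey.

    n_ref is the density that suffices when the two surveys overlap completely.
    Counting error ~ cluster_area^(-1/2); calibration error ~ (n_g * wl_area)^(-1/2).
    """
    if wl_area > cluster_area:
        raise ValueError("sketch assumes the WL survey lies inside the cluster survey")
    return n_ref * cluster_area / wl_area


if __name__ == "__main__":
    # Hypothetical 1% calibration error at 30 arcmin^-2, rescaled to 10 arcmin^-2
    print(wl_calibration_error(0.01, 10.0))
    # Hypothetical all-sky cluster catalog (4e4 deg^2) with a 1e4 deg^2 WL survey
    print(required_source_density(30.0, 4.0e4, 1.0e4))
```

In this simplified picture, a cluster catalog four times larger than the lensing footprint would need
four times the source density to remain limited by halo statistics, which is the sense in which the
requirement becomes "even higher" for an all-sky catalog.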

6.6. Prospects

We expect cluster abundance studies to undergo substantial and steady improvements over the 
next decade and beyond. In the near term (< 3 years), we anticipate advances in X-ray, SZ, and 
optical cluster studies. The XMM Cluster Survey (XCS) and XMM XXL Survey will yield much 
larger X-ray cluster samples at z > 0.3. Planck will produce the definitive all-sky SZ catalog of 
massive clusters out to z < 0.7, while SPT and ACT will probe z > 0.7 cluster populations over 
thousands of square degrees for the first time. In the optical, continuing studies with the SDSS 
will lead to improved cluster finders and richness estimators, as well as improved weak lensing 
calibration thanks to better centering and better source photometric redshifts. On a comparable 
time scale, the RCS-2 survey will obtain g, r, and z imaging to a nominal depth of r ~ 24.8 (roughly 
2 magnitudes deeper than SDSS) over 1000 deg^2, yielding the first large area optical cluster catalog
extending to z ~ 1. Relative to the results shown in Figure 20, these X-ray, SZ, and optical studies
will improve the low redshift σ_8–Ω_m constraint and extend it, at somewhat lower precision, to
z ~ 0.5–1. At the same time, improved calibration and cross-checks among surveys will test for
and reduce remaining sources of systematic error. 

In the medium term (~ 3–8 years), several new optical surveys will cover thousands of
deg^2 with greater depth than SDSS and larger area and/or more photometric bands than RCS-2.
These include the Kilo-Degree Survey (KIDS, 1500 deg^2 in ugriz), DES (5000 deg^2 in grizY), PS1
(15,000 deg^2 in grizY), and the Hyper-Suprime Camera survey (HSC, 1500 deg^2 in grizY). These
surveys should significantly improve the cosmological constraints from RCS-2 thanks to higher
cluster numbers, lower statistical errors in weak lensing mass calibration, and better control of
photometric redshift uncertainties. The VIKING survey will cover 1500 deg^2 at near-IR
wavelengths (ZYJHK_s) at sufficient depth to allow cluster identification and accurate photometric
redshifts at z = 1–2. In addition, all of these surveys will overlap with Planck, and often with
either the ACT or SPT surveys, which can further enhance the utility of both sets of catalogs. DES
in particular is designed to cover the entire footprint of the SPT SZ survey.

With launch expected 2013-2014, eROSITA will produce the ultimate all-sky catalog of massive 
clusters (see §6.5). The optical imaging surveys will allow weak lensing calibration of the eROSITA
mass-observable relations, with multiple independent surveys affording larger overlap area and 
thus more precise calibration. This combination of X-ray selection and optical WL calibration 
offers bright prospects for the coming decade of cluster cosmology. Optical surveys will further 
extend this leverage by probing cluster abundances to masses below those probed by eROSITA. 

On a longer timescale, LSST plans to image 20,000 deg^2 of high-latitude sky in six bands
(ugrizY), with each single pass comparable in depth to the medium-term surveys described above
and co-added data reaching 2.5–3 magnitudes deeper. The increased depth of LSST should allow
one to cleanly select galaxy clusters out to z ~ 1.5. While the greater dynamic range of the cluster
catalogs will be an asset in itself, LSST's most important contribution to cluster cosmology will
be in the form of improved WL mass calibration, both for eROSITA and for LSST's own clusters.
Euclid could provide even better WL calibration over a similar sky area, while WFIRST should
achieve a high WL source density but over a smaller survey area. The IR sensitivity of Euclid
and/or WFIRST should also enable cluster searches at z ~ 2 and beyond.

Figure 28 Error on σ_11,abs(z) achievable by measuring cluster abundances in a redshift bin z =
z_c ± 0.05 in a 10^4 deg^2 survey, assuming mass calibration via stacked weak lensing with Stage III or
Stage IV source densities. We assume that all geometric cosmological parameters (most significantly
the comoving volume element and the matter density parameter) are held fixed, being effectively
constrained by a joint CMB+SN+BAO+WL experiment. Also shown for comparison are the
forecast constraints on σ_11,abs(z) derived from such a joint analysis using our fiducial Stage III and
Stage IV surveys, assuming a w_0–w_a parameterization of dark energy and allowing deviations from
GR parameterized by G_9 and Δγ (see §8.4 for details).

We have argued throughout this section that mass calibration will be the likely limiting factor in 
cluster studies of cosmic acceleration, and that stacked weak lensing is the most promising avenue 
to achieve accurate mass calibration. Figure 28 combines information from Figures 22 and 26,
showing the fractional error on σ_11,abs(z) in Δz = 0.1 bins that can be achieved with a 10^4 deg^2
cluster survey, using the WL mass calibration errors we have forecast for Stage III (left panel) or
Stage IV (right panel) source densities. With Stage III lensing calibration, errors on σ_11,abs(z) are
below 1% at z ~ 0.5 for cluster mass thresholds of 1–2 × 10^14 M_⊙, and ~ 1.5% for a mass threshold
of 4 × 10^14 M_⊙. With Stage IV lensing calibration, the peak sensitivity is better than 0.5% for the
lower mass thresholds and better than 1% for the 4 × 10^14 M_⊙ threshold.

The additional red and blue curves in Figure 28 show the forecast constraints on σ_11,abs(z) for
a fiducial Stage III (blue) or Stage IV (red) program combining SN, BAO, WL, and CMB data as
discussed in ^ These forecasts assume a w_0–w_a dark energy model and allow departures from
GR-predicted growth described by an overall multiplicative offset G_9 and a growth index deviation
Δγ (see §2.2). The fiducial programs are defined in §8.1. If WL systematics are controlled at the
level assumed in these fiducial programs then they should be negligible for cluster mass calibration
relative to statistical errors, so we have not included them in computing Δ ln M.

From Figure 28 we see that a 10^4 deg^2 cluster survey with Stage III WL calibration data can
easily exceed the σ_11,abs(z) precision expected from the Stage III CMB+SN+BAO+WL program, by
as much as a factor of ~ 3 for a threshold of 10^14 M_⊙. Similarly, cluster constraints with Stage IV WL
calibration improve on the fiducial Stage IV σ_11,abs(z) precision without clusters by a factor of ~ 2.
The visual impression that clusters can outperform the fiducial program only at z ~ 0.4–0.8 but
perform worse at high and low redshifts is artificial, since the CMB+SN+BAO+WL curves assume
a smooth growth model while the cluster constraints in Figure 28 are those that can be achieved
from galaxy clusters within each individual redshift bin. For Figure 28 we have assumed that
errors on Ω_m and dV_c(z) are negligible. While the assumption for dV_c(z) should prove reasonably
accurate, the forecast CMB+SN+BAO+WL errors on Ω_m (4% and 1% for Stage III and Stage IV,
respectively) are larger than our assumed WL mass calibration errors for M < 2 × 10^14 M_⊙ (see
Figure 26). In practice, therefore, the fractional errors in Figure 28 would apply not to σ_11,abs(z)
but to the parameter combination σ_11,abs(z) Ω_m^q, with q ~ 0.4. We return to these points in §8.4
below, where we discuss the improvements in constraints on the dark energy equation of state and
on G_9 and Δγ achievable with clusters.
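
To see how much the Ω_m uncertainty matters, one can propagate it through the combination
σ_11,abs(z) Ω_m^q. The Python sketch below does this under the assumption that the error on the
combination and the error on Ω_m are independent and Gaussian, which is a simplification made
here for illustration only and ignores any covariance a full forecast would include. The 1% and
0.5% cluster errors in the example are round numbers of the order quoted in the text for Stage III
and Stage IV, and the function name is hypothetical.

```python
import math


def sigma11_fractional_error(err_combination, err_omega_m, q=0.4):
    """Fractional error on sigma_11,abs(z) when clusters constrain the
    combination C = sigma_11,abs(z) * Omega_m**q.

    Since ln(sigma_11) = ln(C) - q * ln(Omega_m), independent errors add in
    quadrature: var = err_combination**2 + (q * err_omega_m)**2.
    """
    return math.hypot(err_combination, q * err_omega_m)


if __name__ == "__main__":
    # Stage III-like: ~1% on the combination, 4% on Omega_m
    print("Stage III:", sigma11_fractional_error(0.01, 0.04))
    # Stage IV-like: ~0.5% on the combination, 1% on Omega_m
    print("Stage IV :", sigma11_fractional_error(0.005, 0.01))
```

With q = 0.4, a 4% error on Ω_m contributes 1.6% to σ_11,abs(z) and a 1% error contributes 0.4%,
so in both illustrative cases the Ω_m term is comparable to or larger than the cluster term itself.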

If some alternative mass calibration method proves better than stacked weak lensing, then
the situation could be even better than Figure 28 suggests. This would be especially true for
Stage III, where the WL source density is the clear limiting factor on the overall error. For our
assumed Stage IV source density, the uncertainty from WL mass calibration is already close to
the statistical uncertainty in cluster counts at z < 0.6. Conversely, the situation would be worse
than Figure 28 suggests if some other systematic uncertainty (e.g., contamination, miscentering,
theory, or WL photo-z calibration) makes it impossible to achieve the statistical limits of the WL
mass calibration.

In summary, our analysis indicates that cluster abundances with masses calibrated by stacked 
weak lensing could provide strong tests of cosmic acceleration models, beyond those afforded by 
the 2-point WL statistics described in §5. However, achieving this potential requires that mass
calibration uncertainties be controlled at the 1–3% level for Stage III and at the 0.5–1.5% level
for Stage IV. We see no obvious show stoppers, but the challenge is a demanding one. 

7. Alternative Methods

In §§3–6 we have reviewed in detail the four observational methods that have been most widely
discussed, and applied, as probes for the origin of cosmic acceleration. We now review more briefly
some of the other techniques for testing cosmic acceleration models. In some cases, surveys
conducted for SN, BAO, or WL studies will automatically provide the data needed for these alternative
methods. For example, redshift-space distortions (§7.2) and the Alcock-Paczynski effect (§7.3) can
be measured in galaxy redshift surveys designed for BAO measurements, and synoptic surveys
designed for Type Ia supernovae will discover other transients that might provide alternative distance
indicators (§7.4). Just as cluster investigations will increase the cosmological return from WL
surveys, these methods will increase the return from BAO or SN surveys. The potential gains are large,
but they are uncertain because the level of theoretical or observational systematics for these
methods has not yet been comprehensively explored. In §8.5 we will examine how precisely our fiducial
Stage III or Stage IV CMB+SN+BAO+WL programs predict the observables of these methods,
setting targets for the precision and accuracy they should achieve to make major contributions to
cosmic acceleration studies.

Some of the other methods described below require completely different types of observations or 
experiments, falling outside of the "survey mode" that characterizes the methods we have discussed 
so far. Compared to the combined SN+BAO+WL+CL approach, these methods may yield more
limited information or be sensitive only to certain classes of acceleration models, but they can
provide high-precision tests of the standard ΛCDM model, and they could yield surprising results
that would give strong guidance to the physical origin of acceleration. 



7.1. Measurement of the Hubble Constant at z ≈ 0

As emphasized by Hu (2005), a precise measurement of the Hubble constant allows a powerful
test of dark energy models when combined with CMB constraints. In effect, the CMB and H_0
provide the longest achievable lever arm for measuring the evolution of the cosmic energy density,
from z ~ 1100 to z = 0. The sensitivity of H_0 to dark energy is illustrated in Figure O which
shows that a change Δw = ±0.1 alters the predicted value of H_0 by 5% in Ω_k = 0 models that are
normalized to produce the same CMB anisotropies. More generally, a low redshift determination
of the Hubble constant combined with Planck-level CMB data constrains w with an uncertainty
that is twice the fractional uncertainty in H_0, assuming constant w and flatness. The challenge for
future H_0 studies is to achieve the percent-level statistical and systematic uncertainties needed to
remain competitive with other cosmic acceleration methods.
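
The rule of thumb in this paragraph (the uncertainty in w is about twice the fractional uncertainty
in H_0) is easy to apply to any quoted measurement. The Python sketch below encodes it, assuming
constant w, flatness, and CMB data good enough that H_0 dominates the error budget. The function
names are hypothetical and the factor of two is taken directly from the text, not derived here.

```python
def fractional_error(value, sigma):
    """Fractional uncertainty sigma / value."""
    return sigma / value


def sigma_w_from_h0(h0, sigma_h0, slope=2.0):
    """Approximate uncertainty on a constant equation of state w implied by an
    H_0 measurement combined with Planck-level CMB data.

    slope = 2 is the rule of thumb quoted in the text (sigma_w ~ 2 * sigma_H0 / H0);
    it assumes flatness and constant w.
    """
    return slope * fractional_error(h0, sigma_h0)


if __name__ == "__main__":
    # H_0 values are in km/s/Mpc; only the fractional error matters here.
    for label, h0, err in [("10% measurement", 70.0, 7.0),
                           ("5% measurement", 70.0, 3.5),
                           ("1% measurement", 70.0, 0.7)]:
        print(label, "-> sigma_w ~", sigma_w_from_h0(h0, err))
```

The 5% case reproduces the statement that a shift of Δw = ±0.1 corresponds to a 5% change in
H_0, and the 1% case shows why percent-level H_0 measurements are the target: they would
correspond to an uncertainty of roughly 0.02 in w under these assumptions.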

One of the defining goals of the Hubble Space Telescope was to measure H_0 to an accuracy of
10%. The H_0 Key Project achieved this goal, with a final estimate H_0 = 72 ± 8 km s^-1 Mpc^-1
(Freedman et al., 2001), where the error bar was intended to encompass both statistical and
systematic contributions. This estimate used Cepheid-based distances to relatively nearby (D < 25 Mpc)
galaxies observed with WFPC2 to calibrate a variety of secondary distance indicators: Type
Ia supernovae, Type II supernovae, the Tully-Fisher relation of disk galaxies, and the
fundamental plane and surface-brightness fluctuations of early-type galaxies. These secondary indicators
were in turn applied to galaxies "in the Hubble flow," meaning galaxies at large enough distance
(D ~ 40–400 Mpc) that their peculiar velocities v_pec did not contribute significant uncertainty
when computing H_0 = v/d. The Cepheid period-luminosity (P–L) relation was calibrated to
an adopted distance modulus of 18.50 ± 0.10 mag for the Large Magellanic Cloud (LMC). The
uncertainty of the LMC distance itself and the uncertainty in adjusting the LMC P–L relation to
the higher characteristic metallicities of calibrator galaxies were both important contributors to the
final error budget. Another important systematic was the uncertainty in differential measurements
of Cepheid fluxes over a wide dynamic range along the distance ladder.

A number of subsequent developments have allowed substantial improvements in the
measurement of H_0 (Riess et al., 2009; Riess et al., 2011, and references therein), with the most
recent determination of H_0 = 73.8 ± 2.4 km s^-1 Mpc^-1 yielding a 1σ uncertainty of only 3.3%
(Riess et al., 2011). One important change is a shift to Cepheid calibration based on the maser
distances to NGC 4258 (Herrnstein et al., 1999; Humphreys et al., 2008) and on parallaxes to
Galactic Cepheids measured with Hipparcos (van Leeuwen et al., 2007) and with the HST fine-guidance
sensors (Benedict et al., 2007). These calibrations circumvent the statistical and systematic
uncertainties in the LMC distance, and they directly calibrate the P–L relation in the metallicity range
typical of calibrator galaxies, albeit with a sample of only ~ 10 stars in the case of Milky Way
parallaxes. A second improvement is more than doubling the sample of "ideal" Type Ia SNe
(with modern photometry, low reddening, typical properties, and caught before maximum) from
the three available to Freedman et al. (2001) to eight. Of all secondary distance indicators, Type
Ia supernovae have the smallest statistical errors, and probably the smallest systematic errors, and
they can be tied to large samples of supernovae observed at distances that are clearly in the Hubble
flow. Riess et al. (2009, 2011) use Type Ia supernovae exclusively in their H_0 estimates. Third,
Cepheid observations at near-IR wavelengths (1.6 microns) have reduced uncertainties associated
with extinction and the dependence of Cepheid luminosity on metallicity. Finally, relative
calibration uncertainties of Cepheid photometry obtained with different instruments and photometric
systems along the distance ladder have been mitigated by the use of a single instrument, HST's
WFC3, for a large fraction of the data.

Over the next decade, it should be possible to reduce the uncertainty in direct measurement of
H_0 to approach the one-percent level. One crucial step will be the 1% to 5% parallax calibration
of hundreds of long-period Galactic Cepheids within 5 kpc by the Gaia mission, setting the
fundamental calibration of the P–L relation and, to some degree, its metallicity dependence on a solid
geometrical base with distance precision easily better than 1%. Discovery of additional galaxies
with maser distances (like NGC 4258) may also improve the Cepheid calibration or, if they are in
the Hubble flow, may provide a direct determination of the Hubble constant (see Greenhill et al.,
2009). The other key step will be the Cepheid calibration of more Type Ia supernovae, which occur
at a rate of one per 2–3 years in the range D < 35 Mpc accessible to HST with WFC3. JWST
could increase this range to D < 60 Mpc, quadrupling the rate of usable supernovae. Ultimately
a sample of 20 to 30 calibrations of the SN Ia luminosity is needed to reduce the sample size
contribution to uncertainty in H_0 below 1%. With firmer P–L calibration and a larger Type
Ia sample, the remaining uncertainty in H_0 is likely to be dominated by systematic uncertainty
in the linearity of the photometric systems observing nearby and distant Cepheids. This may be
minimized by the careful construction of "flux ladders," analogous to distance ladders but used to
compare the measurements of disparate flux levels. Additional contributions to the determination
of H_0 with few percent precision could come from "golden" lensing systems, Tully-Fisher distances
measured in the far infrared, surface brightness fluctuation measurements further into the Hubble
flow, Sunyaev-Zel'dovich effect measurements, and local volume measurements of BAO.

We discuss the potential contribution of H_0 measurements to dark energy constraints in §§8.3
and 8.5 below. Already, the combination of the 3% measurement of Riess et al. (2011) with CMB
data alone yields w = -1.08 ± 0.10, assuming a flat universe with constant w. The limitation
of H_0 is, of course, that it is a single number at a single redshift, so while it can test any well
specified dark energy model, it provides little guidance on how to interpret deviations from model
predictions. However, precision H_0 measurements can significantly increase the constraining power
of other measurements: for our fiducial Stage IV program described in ^ assuming a w_0–w_a
model for dark energy, a 1% H_0 measurement would raise the DETF Figure of Merit by 40%. A
direct measurement of H_0 also has the potential to reveal departures from the smooth evolution of
dark energy enforced by the w_0–w_a parameterization. In essence, the dark energy model transfers
the absolute distance calibration from moderate redshift BAO measurements down to z = 0, but
unusual low redshift evolution of dark energy can break this link, shifting H_0 away from its expected
value. A precise determination of H_0, coupled to a w(z) parameterization that allows low-redshift
variation, could reveal recent evolution of dark energy and definitively answer the basic question,
"Is the universe still accelerating?"



7.2. Redshift-Space Distortions

As discussed in §2.1, peculiar velocities make large scale galaxy clustering anisotropic in
redshift space (Kaiser, 1987). Measuring this anisotropy (e.g., Cole et al., 1995; Peacock et al., 2001;
Hawkins et al., 2003; Okumura et al., 2008) constrains β = f/b_g, where f is the logarithmic growth
rate of fluctuations (eq. 15) and b_g is the galaxy bias factor (see eq. 43). The real space galaxy
correlation function at redshift z (measured, e.g., from the projected correlation function) constrains
b_g σ_8(z), so the combination of real space clustering and redshift-space distortion (often abbreviated
RSD) can be used to measure f(z)σ_8(z), the product of the matter clustering amplitude and the
growth rate. More generally, one can construct estimators for f(z)σ_8(z) directly from the
dependence of the redshift-space galaxy power spectrum or correlation function on angle relative to the
line of sight (see Percival and White, 2009, who also provide a clear review of the physics of
redshift-space distortions and recent theoretical developments). Anisotropy of clustering in galaxy redshift
surveys thus offers an alternative to weak lensing and cluster abundances as a tool for measuring
the growth of structure. While WL and clusters constrain the amplitude of matter clustering and
yield growth rate constraints from measurements at multiple redshifts, redshift-space distortions
directly measure the rate at which structure is growing at the redshift of observation. Recent
observational analyses include the measurement of Guzzo et al. (2008) from the VIMOS-VLT Deep
Survey (VVDS), f(z) = 0.91 ± 0.36 at z ≈ 0.8, the measurement of Samushia et al. (2011) from
SDSS DR7, obtaining ~ 10% constraints on f(z)σ_8(z) at z = 0.25 and z = 0.37, and the
measurement of Blake et al. (2011) from the WiggleZ survey, obtaining ~ 10% constraints in each of
four redshift bins from z = 0.1 to z = 0.9. All of these measurements are consistent with ΛCDM
predictions.
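
The bookkeeping in this paragraph can be sketched in a few lines of Python: the anisotropy gives
β = f/b_g, real space clustering gives b_g σ_8(z), and their product is f(z)σ_8(z). For comparison,
the sketch also evaluates the widely used GR approximation f ≈ Ω_m(z)^0.55 in a flat ΛCDM
model. That approximation, the value Ω_m = 0.27, the independent-error assumption, and all
function names are illustrative assumptions introduced here, not quantities taken from this section.

```python
import math


def f_sigma8(beta, b_sigma8):
    """Combine beta = f / b_g with b_g * sigma_8(z) to obtain f(z) * sigma_8(z)."""
    return beta * b_sigma8


def fractional_error_of_product(frac_err_a, frac_err_b):
    """Fractional error of a product, assuming independent errors (quadrature sum)."""
    return math.hypot(frac_err_a, frac_err_b)


def omega_m_of_z(z, omega_m0):
    """Matter density parameter at redshift z in a flat LambdaCDM model."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)


def growth_rate_gr(z, omega_m0, gamma=0.55):
    """Approximate GR growth rate f(z) ~ Omega_m(z)**gamma (gamma ~ 0.55 assumed)."""
    return omega_m_of_z(z, omega_m0) ** gamma


if __name__ == "__main__":
    # Hypothetical measured inputs, for illustration only
    print("f*sigma_8 =", f_sigma8(0.5, 0.8))
    print("fractional error =", fractional_error_of_product(0.08, 0.06))
    # Compare the z ~ 0.8 measurement quoted in the text with the GR approximation
    f_pred = growth_rate_gr(0.8, omega_m0=0.27)
    print("predicted f(0.8) =", f_pred, "| measured 0.91 +/- 0.36")
```

Under these assumptions the predicted growth rate at z = 0.8 is close to 0.8, well within the
quoted uncertainty of the VVDS measurement, which is the sense in which such results are
consistent with ΛCDM.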

Redshift-space distortions can be measured with much higher precision from future redshift surveys
designed for BAO studies. These measurements can improve constraints on dark energy models
assuming GR to be correct, and they can be used to constrain (or reveal) departures from GR by
testing consistency of the growth and expansion histories. The key challenge in modeling
redshift-space distortions is accounting for non-linear effects, including non-linear or
scale-dependent bias between galaxies and matter, at the level of accuracy demanded by the
measurement precision. The linear theory formula (143) is a poor approximation even on scales of
$50\,h^{-1}$ Mpc or more (Cole et al., 1994; Hatton and Cole, 1998; Scoccimarro, 2004), with the
strongest non-linear effect being the "finger-of-God" (FoG) distortions in collapsing and virialized
regions, which are opposite in sign from the linear theory distortions. Their effects are commonly
modeled by adding an incoherent small scale velocity dispersion to the linear theory distortions,
but this model is physically incomplete, and it typically leaves 5-10% systematic errors in $\beta$
estimates (Hatton and Cole, 1998). Higher order perturbation theory can be used to refine the large
scale predictions (Scoccimarro, 2004), but this does not capture the small scale dispersion effects.
Tinker et al. (2006) and Tinker (2007) advocate an approach based on halo occupation modeling, which
has the virtue of adopting an explicit, self-consistent physical description that can encompass
linear, quasi-linear, and fully non-linear scales. However, the model is complicated, and it must be
implemented either numerically or by using numerically calibrated fitting formulas that may not
generalize to all cosmologies. Following similar lines, Reid and White (2011) present a simpler and
more fully analytic scheme for computing redshift-space clustering of halos, which may prove
sufficiently accurate for the large scales probed by future surveys. Hikage et al. (2011) suggest
using galaxy-galaxy lensing to estimate the radial distribution of tracer galaxies in their dark
matter halos; this, combined with the virial theorem, should predict the FoG profile.
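
The "linear theory plus incoherent dispersion" model described above can be sketched as follows. The
Kaiser form $(b + f\mu^2)^2$ for the linear distortion is standard, but the Gaussian damping factor,
the parameter names, and the numerical values are illustrative assumptions; exponential and
Lorentzian damping forms are also used in the literature.

```python
import math


def kaiser_factor(mu, bias, f):
    """Linear-theory boost of the redshift-space power spectrum: (b + f mu^2)^2."""
    return (bias + f * mu ** 2) ** 2


def fog_damping(k, mu, sigma):
    """Gaussian damping from an incoherent velocity dispersion (assumed form).

    sigma is the dispersion expressed as a comoving length (velocity / H).
    """
    return math.exp(-(k * mu * sigma) ** 2)


def redshift_space_factor(k, mu, bias, f, sigma):
    """Ratio P_s(k, mu) / P_matter(k) in the dispersion model."""
    return kaiser_factor(mu, bias, f) * fog_damping(k, mu, sigma)


if __name__ == "__main__":
    bias, f, sigma = 2.0, 0.8, 4.0  # hypothetical tracer; sigma in Mpc
    for k in (0.02, 0.1, 0.3):      # wavenumbers in 1/Mpc
        along = redshift_space_factor(k, 1.0, bias, f, sigma)
        across = redshift_space_factor(k, 0.0, bias, f, sigma)
        print(f"k = {k:4.2f}: line-of-sight / transverse power = {along / across:.2f}")
```

On large scales the line-of-sight power is enhanced relative to the transverse power, while at high
$k$ the damping term wins and the ratio drops below unity, which is the sign reversal noted in the
text.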

Since the number of Fourier modes in a 3-dimensional volume increases as $k^3$, the precision of
clustering measurement is generally higher on smaller scales, at least until one hits the shot noise
limits of the tracer population. Forecasts of cosmological constraints from RSD remain uncertain
because it is not clear how small a scale and how high a precision one can reach before being
limited by theoretical modeling systematics. However, even with assumptions that appear
conservative, the prospects look promising. For example, assuming a maximum $k$ equal to
$0.075\,{\rm Mpc}^{-1}$ at $z = 0$ and tracking the non-linear scale $k_{\rm nl}$ at higher
redshifts, White et al. (2009) predict $1\sigma$ errors on $f(z)\sigma_8(z)$ of a few percent per
$\Delta z = 0.1$ redshift bin out to $z = 0.6$ from the SDSS-III BOSS survey, and for a
Euclid/WFIRST-like survey they predict errors per $\Delta z = 0.1$ that drop from $\sim 1\%$ at
$z = 0.8$ to $\sim 0.2\%$ at $z = 1.9$. McDonald and Seljak (2009) show that analyzing multiple
galaxy populations with distinct bias factors in the same volume can partly circumvent the limits
usually imposed by sample variance, an idea that is incorporated into White et al.'s forecasts.
Reid and White (2011) examine BOSS RSD forecasts in more detail, considering the impact of modeling
uncertainties. They forecast a $1\sigma$ error on $f(z)\sigma_8(z)$ at $z = 0.55$ of 1.5% using
correlation function measurements down to a comoving scale $s_{\rm min} = 10$ Mpc, rising to 3% if
the minimum scale is $s_{\rm min} = 30$ Mpc. (The corresponding wavenumber scale is
$k_{\rm max} \approx 1.15\pi/s_{\rm min}$.) These forecasts assume marginalization over a nuisance
parameter $\sigma$ characterizing the small scale velocity dispersion. They improve by a factor of
$\sim 1.5$ if $\sigma$ is assumed to be known perfectly, demonstrating the potential gains from a
method (like that of Tinker 2007) that can use smaller scale measurements to pin down the impact of
velocity dispersions.
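
To make the mode-counting argument concrete, the sketch below converts the two minimum separations
quoted above into wavenumbers with the stated relation $k_{\rm max} \approx 1.15\pi/s_{\rm min}$ and
compares the number of Fourier modes available. The survey volume is a hypothetical placeholder, and
the mode count $V k_{\rm max}^3/(6\pi^2)$ (which counts $\mathbf{k}$ and $-\mathbf{k}$ separately) is
the usual continuum estimate, adopted here as an assumption.

```python
import math


def k_max_from_s_min(s_min):
    """Wavenumber matched to a minimum separation: k_max ~ 1.15 pi / s_min."""
    return 1.15 * math.pi / s_min


def number_of_modes(volume, k_max):
    """Fourier modes with |k| < k_max in a volume V: V k_max^3 / (6 pi^2)."""
    return volume * k_max ** 3 / (6.0 * math.pi ** 2)


if __name__ == "__main__":
    volume = 1.0e9  # hypothetical survey volume in Mpc^3
    for s_min in (10.0, 30.0):
        k_max = k_max_from_s_min(s_min)
        modes = number_of_modes(volume, k_max)
        print(f"s_min = {s_min:4.1f} Mpc -> k_max = {k_max:.3f} / Mpc, modes = {modes:.2e}")
```

Going from 30 Mpc to 10 Mpc raises the mode count by a factor of 27, so pure mode counting would
suggest a gain of about $\sqrt{27} \approx 5$ in precision. The quoted forecast improves by only a
factor of 2, which illustrates that mode counting alone overstates the gain from small scales.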

At the percent level there is another potential systematic error in RSD if the selection function
has an orientation dependence (e.g., due to fiber aperture or self-extinction by dust in the target
galaxy) and galaxies are aligned by the large-scale tidal field. This exactly mimics RSD, even in
the linear regime (Hirata, 2009), but fortunately the effect seems to be negligible for present
surveys. Orientation-dependent selection is predicted to be a larger effect for high-$z$ Lyα
emitters (Zheng et al., 2011), since there the radiation can resonantly scatter in the IGM and must
make its way out through the large-scale velocity flows surrounding the galaxy; at very high
redshift ($z = 5.7$) simulations predict an order unity effect. The implications for Lyα emitters at
more modest redshift will become clear with the HETDEX survey.

In §8.5.3 we show that our fiducial Stage IV program constrains $\sigma_8(z)f(z)$ to a $1\sigma$
precision of 2% at $z = 0.5$ and 1% at $z > 1$ if we assume a $w_0 - w_a$ dark energy model with
$G_9$ and $\Delta\gamma$ as parameters to describe departures from GR. Thus, RSD measurements with
this level of precision or better can significantly improve the figure-of-merit for dark energy
constraints and sharpen tests of GR, even in a combined program that includes powerful weak lensing
constraints. Much weaker RSD measurements could still make a significant contribution to Stage III
constraints. Forecasts of the contribution of redshift-space distortions to constraints from
specific Stage IV experiments (a BigBOSS-like ground-based survey and a WFIRST-like space-based
survey) are presented by Stril et al. (2010) and Wang et al. (2010).

Figure 29. Evolution of the parameter combination $H(z)D_A(z)$ constrained by the Alcock-Paczynski
test, for the same suite of CMB-normalized models shown in Figure 2.



7.3. The Alcock-Paczynski Test 

The translation from angular and redshift separations to comoving separations depends on
$D_A(z)$ and $H(z)$, respectively. Therefore, even if peculiar velocities are negligible, clustering
in redshift space will appear anisotropic if one adopts an incorrect cosmological model,
specifically one with an incorrect value of the product $H(z)D_A(z)$. Alcock and Paczynski (1979;
hereafter AP) proposed an idealized cosmological test using this idea, based on a hypothetical
population of intrinsically spherical galaxy clusters. The AP test can be implemented in practice by
using the amplitude of quasar or galaxy clustering to identify equivalent scales in the angular and
redshift dimensions (Ballinger et al., 1996; Matsubara and Suto, 1996; Popowski et al., 1998;
Matsubara and Szalay, 2001) or by using anisotropy of clustering in the Lyα forest (Hui et al., 1999;
McDonald and Miralda-Escudé, 1999). AP measurements provide a cosmological test in their own right,
and they allow high-redshift distance measurements to be translated into constraints on $H(z)$,
which is a more direct measure of energy density.

Figure 29 shows the evolution of $H(z)D_A(z)$ for the same set of CMB-normalized models shown
earlier in Figures 2 and 3. At low redshift, the model dependence resembles that of $H(z)/ch$
(lower left panel of Figure 2), but deviations are reduced in amplitude because of partial
cancellation between $H(z)$ and $D_A(z) \propto \int_0^z dz'\,H^{-1}(z')$. At high redshift
$\Omega_k = \pm 0.01$ has a larger impact than $1 + w = \pm 0.1$. Note that negative space curvature
(positive $\Omega_k$) tends to increase $D_A$, but because the CMB normalization lowers $\Omega_m$
(see Table 1) and thus $H(z)/H_0$, the net effect is to decrease $H(z)D_A(z)$. In §8.5.2 we show that
our fiducial Stage IV program predicts $H(z)D_A(z)$ with an accuracy of $\sim 0.15$-$0.3\%$, assuming
a $w_0 - w_a$ dark energy model. AP measurements at this level could significantly improve dark
energy constraints. For Stage III, the predictions are considerably weaker, $\sim 0.5$-$0.9\%$.
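
The quantity probed by the AP test can be evaluated directly for simple models. The sketch below
computes $H(z)D_A(z)/c$ for a flat universe with constant $w$, taking $D_A$ to be the comoving
angular diameter distance. The parameter values and function names are illustrative assumptions, and
the comparison holds $\Omega_m$ fixed rather than applying the CMB normalization used for Figure 29,
so the numbers are not expected to reproduce that figure.

```python
import math


def e_of_z(z, omega_m=0.27, w=-1.0):
    """Dimensionless Hubble rate H(z)/H0 for a flat model with constant w."""
    omega_de = 1.0 - omega_m
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_de * (1.0 + z) ** (3.0 * (1.0 + w)))


def ap_parameter(z, omega_m=0.27, w=-1.0, steps=2000):
    """H(z) D_A(z) / c with D_A the comoving angular diameter distance (flat space).

    The integral of dz'/E(z') from 0 to z is evaluated with Simpson's rule.
    """
    if steps % 2:
        steps += 1
    h = z / steps
    total = 1.0 / e_of_z(0.0, omega_m, w) + 1.0 / e_of_z(z, omega_m, w)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / e_of_z(i * h, omega_m, w)
    return e_of_z(z, omega_m, w) * total * h / 3.0


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0):
        fiducial = ap_parameter(z, w=-1.0)
        shifted = ap_parameter(z, w=-0.9)
        print(f"z = {z:3.1f}: H D_A / c = {fiducial:.4f}, "
              f"change for w = -0.9: {100.0 * (shifted / fiducial - 1.0):+.2f}%")
```

An isotropic feature analyzed with the wrong cosmology appears stretched or squashed along the line
of sight by the ratio of the true to the assumed value of this product.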

Like redshift-space distortion measurements, the AP test is automatically enabled by redshift
surveys conducted for BAO. In practice the two effects must be modeled together (see, e.g.,
Matsubara 2004), and the principal systematic uncertainty for AP measurements is the uncertainty
in modeling non-linear redshift-space distortions. At present, it is difficult to forecast the likely
precision of future AP measurements because there have been no rigorous tests of the accuracy
of redshift-space distortion corrections at the level of precision reachable by such surveys. If one
assumes that redshift-space distortions can be modeled adequately up to $k \approx k_{\rm nl}$, then
the potential gain from AP measurements is impressively large. For example, Wang et al. find that
using the full galaxy power spectrum in a space-based emission-line redshift survey increases the
forecast value of the DETF FoM by a factor of $\sim 3$ relative to the BAO measurement alone; this
gain is not broken down into separate contributions, but we suspect that the largest portion comes
from AP.

Halo occupation methods (see §2.3) provide a useful way of approaching peculiar velocity
uncertainties in AP measurements. Observations and theory imply that galaxies reside in halos, and
on average the velocity of galaxies in a halo should equal the halo's center-of-mass velocity because
galaxies and dark matter feel the same large scale acceleration. However, the dispersion of satellite
galaxy velocities in a halo could differ from the dispersion of dark matter particle velocities by a
factor of order unity, and central galaxies could have a dispersion of velocities relative to the
halo center-of-mass (van den Bosch et al., 2005; Tinker et al., 2006). To be convincing, an AP
measurement must show that it is robust (relative to its statistical errors) to plausible variations
in the halo occupation distribution and to plausible variations in the velocity dispersion of
satellite and central galaxies. Alternatively, the AP errors can be marginalized over uncertainties
in these multiple galaxy bias parameters, drawing on constraints from the observed redshift-space
clustering.

BAO measurements from spectroscopic surveys in some sense already encompass the AP effect,
since they use the location of the BAO scale as a function of angular and redshift separations to
separately constrain $D_A(z)$ and $H(z)$. However, the addition of a high-precision AP measurement
from smaller scales could significantly improve the BAO cosmology constraints. BAO measurements
typically constrain $D_A(z)$ better than $H(z)$ because there are two angular dimensions and only
one line-of-sight dimension. However, at high redshift $H(z)$ is more sensitive to dark energy than
$D_A(z)$, since $H(z)$ responds directly to the dark energy density $u_\phi(z)$ through the
Friedmann equation (3), while $D_A(z)$ is an integral of $H_0/H(z')$ over all $z' < z$. An AP
measurement would allow the BAO measurement of $D_A(z)$ to be "transferred" to $H(z)$, thus yielding
a better measure of the dark energy contribution. Blake et al. (2011b) have recently implemented a
similar idea by using AP measurements in the WiggleZ survey (with 10-15% precision in each of four
redshift bins out to $z = 0.8$) to convert SN luminosity distances into $H(z)$ determinations.
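
The "transfer" described above amounts to dividing the AP product $H(z)D_A(z)$ by an independently
measured distance. A minimal sketch of the resulting error budget follows, assuming independent
Gaussian errors so that fractional errors add in quadrature. That assumption, the function name, and
the 3% distance error are illustrative and are not taken from the analyses cited.

```python
import math


def hubble_error_from_ap(frac_err_distance, frac_err_ap):
    """Fractional error on H(z) inferred as (H D_A) / D_A.

    Assumes the distance and AP measurements have independent Gaussian errors.
    """
    return math.hypot(frac_err_distance, frac_err_ap)


if __name__ == "__main__":
    distance_error = 0.03  # hypothetical fractional distance error
    for ap_error in (0.10, 0.15):  # AP precision range quoted in the text
        total = hubble_error_from_ap(distance_error, ap_error)
        print(f"AP error {ap_error:.0%} -> H(z) error {total:.1%}")
```

When the AP error dominates, as in this example, the precision on $H(z)$ is set almost entirely by
the AP measurement, so improvements in AP modeling translate nearly one-to-one into better $H(z)$
determinations.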

Footnote: Since RSD and AP both depend on modeling the broadband $P(k,\mu)$, it is difficult even in
principle to separate the two types of constraints in an observational analysis.

The AP test can be implemented with measures other than the power spectrum or correlation
function. One option is to use the angular distribution of small scale pairs of quasars or
galaxies (Phillipps, 1994; Marinoni and Buzzi, 2010), though peculiar velocities still affect this
measure, in a redshift- and cosmology-dependent way (Jennings et al., 2011). A promising recent
suggestion is to use the average shape of voids in the galaxy distribution; individual voids are
ellipsoidal, but in the absence of peculiar velocities the mean shape should be spherical (Ryden,
1995; Lavaux and Wandelt, 2010). Typical voids are of moderate scale ($R \sim 10\,h^{-1}$ Mpc) and
have a large filling factor $f$, so the achievable precision in a large redshift survey is high if
the sampling density is sufficient to allow accurate void definition. A naive estimate for the
error on the mean ellipticity of voids with rms ellipticity $\epsilon_{\rm rms}$ in a survey volume
$V$ is approximately

$$ \epsilon_{\rm rms}\left(\frac{fV}{\frac{4}{3}\pi R^3}\right)^{-1/2} \approx 6\times 10^{-4}
\left(\frac{\epsilon_{\rm rms}}{0.3}\right)
\left(\frac{fV}{1\,h^{-3}\,{\rm Gpc}^3}\right)^{-1/2}
\left(\frac{R}{10\,h^{-1}\,{\rm Mpc}}\right)^{3/2} . $$

Peculiar velocities have a small, though not negligible, impact on void sizes and shapes (Little et
al., 1991; Ryden and Melott, 1996; Lavaux and Wandelt, 2011), so one can hope that the uncertainty
in this impact will be small, but this hope has yet to be tested. Assuming statistical errors only,
Lavaux and Wandelt (2011) estimate that a void-based AP constraint from a Euclid-like redshift
survey would provide several times better dark energy constraints than the BAO measurement from
the same data set, mainly because the scale of voids is so much smaller than the BAO scale.
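
The naive void estimate is simply the rms ellipticity divided by the square root of the number of
voids, $N = fV / (\frac{4}{3}\pi R^3)$. The sketch below evaluates it for the fiducial values used
in the text ($\epsilon_{\rm rms} = 0.3$, $fV = 1\,h^{-3}\,{\rm Gpc}^3$, $R = 10\,h^{-1}$ Mpc); the
function and argument names are hypothetical.

```python
import math


def void_ellipticity_error(e_rms=0.3, filled_volume=1.0e9, radius=10.0):
    """Naive error on the mean void ellipticity: e_rms / sqrt(N_voids).

    filled_volume is f*V in (Mpc/h)^3 and radius is R in Mpc/h, so that
    N_voids = f*V / (4/3 pi R^3).
    """
    n_voids = filled_volume / (4.0 / 3.0 * math.pi * radius ** 3)
    return e_rms / math.sqrt(n_voids)


if __name__ == "__main__":
    print(f"fiducial error on mean ellipticity: {void_ellipticity_error():.1e}")
    print(f"with R = 20 Mpc/h:                  {void_ellipticity_error(radius=20.0):.1e}")
```

The fiducial case gives about $6 \times 10^{-4}$. Because fewer large voids fit in a given volume,
the error grows as $R^{3/2}$ at fixed $fV$.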

7.4. Alternative Distance Indicators

In earlier sections we have discussed the two most well established methods for measuring the
cosmological distance scale beyond the local Hubble flow: Type Ia supernovae and BAO. These two
methods set a high bar for any alternative distance indicators. Type Ia supernovae are highly
luminous, making them relatively easy to discover and measure at large distances. Once corrected
for light curve duration, local Type Ia's have a dispersion of 0.1-0.15 mag in peak luminosity
despite sampling stellar populations with a wide range of age and metallicity, and extreme outliers
are apparently rare (Li et al., 2011). Thus far, surveys are roughly succeeding in achieving the
$\sqrt{N}$ error reduction from large samples, though progress on systematic uncertainties will be
required to continue these gains. The BAO standard ruler is based on well understood physics, and it
yields distances in absolute units. "Evolutionary" corrections (from non-linear clustering and
galaxy bias) are small and calculable from theory.

Core collapse supernovae exhibit much greater diversity than Type Ia supernovae, which is not
surprising given the greater diversity of their progenitors. However, Type IIP supernovae,
characterized by a long "plateau" in the light curve after peak, show a correlation between
expansion velocity (measured via spectral lines) and the bolometric luminosity of the plateau phase,
making them potentially useful as standardized candles with $\sim 0.2$ mag luminosity scatter
(Hamuy and Pinto, 2002; see Maguire et al. 2010 for a recent discussion). Unfortunately, as distance
indicators Type IIP supernovae appear to be at least slightly inferior to Type Ia supernovae on
every score: they are less luminous, the scatter is larger, the fraction of outliers may be larger,
and they arise in star-forming environments that are prone to dust extinction. With the existence of
cosmic acceleration now well established by multiple methods, we are skeptical that Type IIP
supernovae can make a significant contribution to refinement of dark energy constraints.

The door for alternative distance indicators is more open beyond $z = 1$, the effective limit
of most current SN and BAO surveys. Gamma-ray bursts (GRBs) are highly luminous, so they
can be detected to much higher redshifts than optical supernovae; the current record holder is
at $z \approx 8.2$ (Tanvir et al., 2009; Salvaterra et al., 2009). GRBs are extremely diverse and
highly beamed, but they exhibit correlations (Amati, 2006; Ghirlanda et al., 2006) between
equivalent isotropic energy and spectral properties (such as the energy of peak intensity) or
variability. These correlations can be used to construct distance-redshift diagrams for those
systems with redshift measured via spectroscopy of afterglow emission or of host galaxies (e.g.,
Schaefer 2007; see Demianski and Piedipalumbo 2011 for a recent review and discussion). While GRBs
reach to otherwise inaccessible redshifts, we are again skeptical that they can contribute to our
understanding of dark energy because of statistical limitations and susceptibility to systematics.
It has taken detailed observations of many hundreds of Type Ia supernovae, local and distant, to
understand their systematics and statistics. The number of GRBs with spectroscopic redshifts is
$\sim 100$, and the spectroscopic sample may be a biased subset of the full GRB population because
of the requirement of a bright optical afterglow or identified host galaxy.

Quasars are another tool for reaching high redshifts, drawing on empirical correlations between
line equivalent widths and luminosity (Baldwin, 1977) or between luminosity and broad line region
radius (Bentz et al., 2009). For example, Watson et al. (2011) have recently proposed reverberation
mapping (which measures broad line region radius) of large quasar samples to constrain dark
energy models. The high redshift quasar population is systematically different (in black hole mass
and host galaxy environment) from lower redshift calibrators. While quasar spectral properties
appear remarkably stable over a wide span of redshift (Steffen et al., 2006), one would have to be
prepared to argue that subtle (e.g., 10% or smaller) changes with redshift were a consequence of
cosmology rather than evolution. Of course, quasar distance indicators also face the same challenges
of photometric calibration, k-corrections, and dust extinction that affect supernova studies.

Radio galaxies have been employed as a standard (or at least standardizable) ruler for
distance-redshift studies, drawing on empirically tested theoretical models that connect the source
size to its radio properties (Daly, 1994; Daly and Guerra, 2002). Analysis of 30 radio galaxies out
to $z = 1.8$ gives results consistent with those from Type Ia supernovae (Daly et al., 2009). The
number of radio galaxies to which this technique can be applied is limited, and the model
assumptions used to translate observables into distance estimates are fairly complex (see Daly et
al. 2009, §2.1). We therefore expect that both statistical and systematic limitations will prevent
this method from becoming competitive with supernovae and BAO.

In §8.5.4 we show forecast distance errors for our fiducial Stage III and Stage IV experimental
programs, presenting a target for alternative methods. If one assumes a $w_0 - w_a$ model then the
constraints are very tight, with errors below $\sim 0.25\%$ at Stage IV and $\sim 0.5\%$ at Stage
III. However, with a general $w(z)$ model the constraints become much weaker outside the redshift
range directly measured by Type Ia SNe or BAO. In particular, our Stage IV forecasts presume large
BAO surveys at $z > 1$, and if these do not come to fruition there is much more room for alternative
indicators at high redshift.



7.5. Standard Sirens 

Gravitational wave astronomy opens an entirely different route to distance measurement, with
an indicator that is grounded in fundamental physics (Schutz, 1986). The basic concept is
illustrated by considering a nearly Newtonian binary system of two black holes with total mass $M$
and reduced mass $\mu$, in a nearly circular orbit at separation $a$. The gravitational wave
luminosity of such a source is

$$ L_{\rm GW} = \frac{32\,G^4 \mu^2 M^3}{5\,c^5 a^5} . \qquad (156) $$

If one can measure the angular velocity of the orbit, $\omega = \sqrt{GM/a^3}$, its rate of change
due to inspiral as the binary loses energy, $\dot\omega/\omega = 96\,G^3 \mu M^2/(5 c^5 a^4)$, and
the orbital velocity $v = \sqrt{GM/a}$ (using relativistic corrections to the emitted waveform), one
has enough information to solve for $a$, $M$, and $\mu$. One can therefore calculate $L_{\rm GW}$
from the measured observables and compare to the measured energy flux to infer distance. In
practice, one would need to solve for other dimensionless parameters such as the eccentricity and
the orientation of the orbit, black hole spins, and source position on the sky. The solution is not
trivial (!), but gravitational waveforms from relativistic binaries encode this information in
higher harmonics and modulation of the signal due to precession (Arun et al., 2007). Because of the
analogy between gravitational wave observations and acoustic wave detection, this approach is often
referred to as the "standard siren" method.
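
The inversion outlined above can be checked numerically in the idealized Newtonian, circular-orbit
limit. The sketch below builds the three observables from a chosen binary, recovers $a$, $M$, and
$\mu$ from them, evaluates equation (156), and infers a distance from the inverse-square law. It
ignores redshift, orbital orientation, spins, and detector response, and the binary parameters,
constants to four digits, and function names are assumptions chosen only for illustration.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m s^-1
M_SUN = 1.989e30  # solar mass, kg


def observables(total_mass, reduced_mass, separation):
    """Forward model: (omega, omega_dot/omega, v) for a circular Newtonian binary."""
    omega = math.sqrt(G * total_mass / separation ** 3)
    omega_dot_over_omega = (96.0 * G ** 3 * reduced_mass * total_mass ** 2
                            / (5.0 * C ** 5 * separation ** 4))
    velocity = math.sqrt(G * total_mass / separation)
    return omega, omega_dot_over_omega, velocity


def invert(omega, omega_dot_over_omega, velocity):
    """Recover (a, M, mu) from the three observables."""
    separation = velocity / omega
    total_mass = velocity ** 2 * separation / G
    reduced_mass = (omega_dot_over_omega * 5.0 * C ** 5 * separation ** 4
                    / (96.0 * G ** 3 * total_mass ** 2))
    return separation, total_mass, reduced_mass


def gw_luminosity(total_mass, reduced_mass, separation):
    """Equation (156): L_GW = 32 G^4 mu^2 M^3 / (5 c^5 a^5)."""
    return (32.0 * G ** 4 * reduced_mass ** 2 * total_mass ** 3
            / (5.0 * C ** 5 * separation ** 5))


def distance_from_flux(luminosity, flux):
    """Distance from the inverse-square law, ignoring redshift and orientation."""
    return math.sqrt(luminosity / (4.0 * math.pi * flux))


if __name__ == "__main__":
    mass = 2.0e6 * M_SUN            # two equal black holes of 1e6 solar masses
    mu = mass / 4.0                 # reduced mass for equal components
    a = 100.0 * G * mass / C ** 2   # wide enough to be nearly Newtonian
    true_distance = 1.0e25          # metres, arbitrary
    flux = gw_luminosity(mass, mu, a) / (4.0 * math.pi * true_distance ** 2)

    a_fit, mass_fit, mu_fit = invert(*observables(mass, mu, a))
    inferred = distance_from_flux(gw_luminosity(mass_fit, mu_fit, a_fit), flux)
    print(f"inferred / true distance = {inferred / true_distance:.6f}")
```

The three observables determine the three unknowns because $v/\omega = a$, $v^2 a = GM$, and
$\dot\omega/\omega$ then fixes $\mu$.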

Footnote: The observed frequency of the gravitational wave is $2\omega$ because the source is a
quadrupole, producing two crests and two troughs per orbit.

There are several practical obstacles to gravitational wave cosmology. First, of course,
gravitational waves from an extragalactic source must be detected. The most promising near-term
possibility is nearby ($z \ll 1$) neutron star binaries, which should be detected by the
ground-based Advanced LIGO detector (to start observations in $\sim 2014$) and upgraded VIRGO
detector, and which could be used to measure $H_0$. The space-based gravitational wave detector LISA
(possible launch in the 2020s) is designed to allow high S/N measurements of the mergers of massive
black holes at the centers of galaxies at cosmological distances, which would enable a full Hubble
diagram $D_L(z)$ to be constructed. A second complication is that gravitational wave observations
yield a distance but do not give an independent source redshift. One thus needs an identification of
the host galaxy, and given the angular positioning accuracy of gravitational wave observations this
will generally require identification of an electromagnetic transient that accompanies the
gravitational wave burst. Possibilities include GRBs resulting from neutron star mergers (Dalal et
al., 2006) and the optical, X-ray, or radio signatures of the response of an accretion disk to a
massive black hole merger (Milosavljevic and Phinney, 2005; Lippai et al., 2008). However, both the
event rates and the characteristics of the electromagnetic signatures are poorly understood at
present. One can also make identifications statistically using large scale structure (MacLeod and
Hogan, 2008). A third complication, important for the high S/N observations expected from LISA, is
that weak lensing magnification becomes a dominant source of noise at $z > 1$, inducing a scatter in
distance of several percent per observed source (Marković, 1993; Holz and Linder, 2005; Jönsson et
al., 2007). By taking advantage of the non-Gaussian shape of the lensing scatter, one can reduce the
error on the mean by a factor $\sim 2$-$3$ below the naive $\sigma/\sqrt{N}$ expectation (Hirata et
al., 2010; Shang and Haiman, 2011), so samples of a few dozen well observed sources could yield
sub-percent distance scale errors.
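
The arithmetic behind the "few dozen sources" statement can be sketched as follows. The 5% scatter
per source and the sample size of 36 are illustrative stand-ins for "several percent" and "a few
dozen", and the gain factor is simply applied as a divisor on the naive $\sigma/\sqrt{N}$ error; the
function name is hypothetical.

```python
import math


def mean_distance_error(scatter_per_source, n_sources, non_gaussian_gain=1.0):
    """Error on the mean distance: sigma / sqrt(N), divided by an optional gain factor."""
    return scatter_per_source / (math.sqrt(n_sources) * non_gaussian_gain)


if __name__ == "__main__":
    scatter, n_sources = 0.05, 36  # assumed lensing scatter and sample size
    for gain in (1.0, 2.0, 3.0):
        error = mean_distance_error(scatter, n_sources, gain)
        print(f"gain factor {gain:.0f}: error on mean distance = {error:.2%}")
```

Under these assumptions the naive error is already below one percent, and the non-Gaussian
estimators reduce it to a few tenths of a percent.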



Nissanke et al. (2010) forecast constraints on $H_0$ from next-generation ground-based
gravitational wave detectors, including Monte Carlo simulations of parameter recovery from neutron
star-neutron star and neutron star-black hole mergers. They find that $H_0$ can be constrained to 5%
even for 15 NS-NS mergers with GRB counterparts and a network of three gravitational wave detectors.
While the event rate is highly uncertain, tens of events per year are quite possible and could lead
to percent-level constraints on $H_0$ a decade or so from now.
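
The step from a 5% constraint with 15 events to percent-level constraints after a decade follows
from simple $1/\sqrt{N}$ scaling, sketched below. The scaling itself is an assumption (it neglects
systematic floors and differences in event quality), and the rate of 30 events per year is a
hypothetical value standing in for "tens of events per year".

```python
import math


def events_needed(target_error, ref_error=0.05, ref_events=15):
    """Events required for a target fractional H0 error, assuming error ~ 1/sqrt(N)."""
    return math.ceil(ref_events * (ref_error / target_error) ** 2 - 1e-9)


def years_needed(target_error, events_per_year):
    """Observing time implied by a constant event rate."""
    return events_needed(target_error) / events_per_year


if __name__ == "__main__":
    rate = 30.0  # hypothetical events per year
    for target in (0.02, 0.01):
        print(f"target {target:.0%}: {events_needed(target)} events, "
              f"about {years_needed(target, rate):.1f} years at {rate:.0f} per year")
```

A 1% constraint requires a few hundred events under this scaling, which corresponds to roughly a
decade of observation at the assumed rate.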

It remains to be seen whether standard sirens can compete with other distance indicators in the
LIGO/VIRGO and LISA era. In the longer run, a futuristic gravitational wave space mission like
the Big Bang Observer, designed to search for gravitational waves from the inflation epoch, could
measure the Hubble constant to $\sim 0.1\%$ precision using large numbers of compact-star binaries,
yielding strong dark energy constraints (Cutler and Holz, 2009).



7.6. The Lyα Forest as a Probe of Structure Growth

The Lyα forest is an efficient tool for mapping structure at $z \sim 2$-$4$ because each quasar
spectrum provides many independent samples of the density field along its line of sight. (At lower
redshifts Lyα absorption moves to UV wavelengths unobservable from the ground, and at higher
redshifts the forest becomes too opaque to trace structure effectively.) The relation between Lyα
absorption and matter density is non-linear and to some degree stochastic. However, the physics
of this relation is straightforward and fairly well understood, in contrast to the more complicated
processes that govern galaxy formation. We have previously discussed the Lyα forest as a method
of measuring BAO at $z > 2$, which requires only that the forest provide a linearly biased tracer of
the matter distribution on $\sim 150$ Mpc scales. However, by drawing on a more detailed theoretical
description of the forest, one can use Lyα flux statistics to infer the amplitude of matter
fluctuations and thus measure structure growth at redshifts inaccessible to weak lensing or
clusters.

The Lyα forest is described to surprisingly good accuracy by the Fluctuating Gunn-Peterson
Approximation (FGPA; Weinberg et al. 1997; see also Gunn and Peterson 1965; Rauch et al. 1993;
Croft et al. 1998), which relates the transmitted flux $F = \exp(-\tau_{\rm Ly\alpha})$ to the dark
matter overdensity $\rho/\bar\rho$, with the latter smoothed on approximately the Jeans scale of the
diffuse intergalactic medium (IGM), where gas pressure supports the gas against gravity (Schaye,
2001). Most gas in the low density IGM follows a power-law relation between temperature and density,
$T = T_0(\rho/\bar\rho)^{\alpha}$ with $\alpha \lesssim 0.6$, which arises from the competition
between photo-ionization heating and adiabatic cooling (Katz et al., 1996; Hui and Gnedin, 1997).
The Lyα optical depth is proportional to the hydrogen recombination rate, which scales as
$\rho^2 T^{-0.7}$ in the relevant temperature range near $10^4$ K. This line of argument leads to
the relation

$$ F = \exp(-\tau_{\rm Ly\alpha}) \approx \exp\left[-A\,(\rho/\bar\rho)^{2-0.7\alpha}\right] . \qquad (157) $$

The constant $A$ depends on a combination of parameters that are individually uncertain (see
Croft et al. 1998; Peeples et al. 2010), and the value of $\alpha$ depends on the IGM reionization
history, so in practice these parameters must be inferred empirically from the Lyα forest
observables. However, even after marginalizing over these parameters there is enough information in
the clustering statistics of the flux $F$ to constrain the shape and amplitude of the matter power
spectrum (e.g., Croft et al. 2002; Viel et al. 2004; McDonald et al. 2006). Lyα forest surveys
conducted for BAO allow high precision measurements of flux correlations on smaller scales, so they
have the statistical power to achieve tight constraints on matter clustering.
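
The chain of scalings that leads to equation (157) is short enough to write out directly: an optical
depth proportional to $\rho^2 T^{-0.7}$, combined with $T \propto \rho^{\alpha}$, gives an optical
depth proportional to $(\rho/\bar\rho)^{2-0.7\alpha}$. The sketch below evaluates the resulting
flux for a few overdensities. The amplitude $A = 0.5$ is a hypothetical value chosen only for
illustration, since the text notes that $A$ must be inferred empirically.

```python
import math


def fgpa_exponent(alpha):
    """Power of the overdensity in the optical depth: rho^2 T^-0.7 with T ~ rho^alpha."""
    return 2.0 - 0.7 * alpha


def transmitted_flux(overdensity, amplitude, alpha=0.6):
    """Equation (157): F = exp(-A (rho/rho_bar)^(2 - 0.7 alpha))."""
    return math.exp(-amplitude * overdensity ** fgpa_exponent(alpha))


if __name__ == "__main__":
    amplitude = 0.5  # hypothetical value of A
    for overdensity in (0.3, 1.0, 3.0):
        flux = transmitted_flux(overdensity, amplitude)
        print(f"rho/rho_bar = {overdensity:3.1f} -> F = {flux:.3f}")
```

The steep dependence of the flux on overdensity is what makes the forest saturate in moderately
overdense regions and lose sensitivity there.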

There are numerous physical complications not captured by equation (157). On small scales,
absorption is smoothed along the line of sight by thermal motions of atoms. Peculiar velocities
add scatter to the relation between flux and density, though this effect is mitigated if one uses the
redshift-space $\rho/\bar\rho$ in equation (157). Gas does not perfectly trace dark matter, so
$(\rho/\bar\rho)_{\rm gas}$ is not identical to $(\rho/\bar\rho)_{\rm dm}$. Shock heating and
radiative cooling push some gas off of the temperature-density relation. All of these effects can be
calibrated using hydrodynamic cosmological simulations, and since the physical conditions are not
highly non-linear and the effects are moderate to begin with, uncertainties in the effects are not a
major source of concern.

A more serious obstacle to accurate predictions is the possibility that inhomogeneous IGM
heating (especially heating associated with helium reionization, which is thought to occur at
$z \approx 3$) produces spatially coherent fluctuations in the temperature-density relation that
appear as extra power in Lyα forest clustering, or makes the relation more complicated than the
power law that is usually assumed (McQuinn et al., 2011; Meiksin and Tittley, 2011). Fluctuations in
the ionizing background radiation can also produce extra structure in the forest, though this effect
should be small on comoving scales below $\sim 100$ Mpc (McQuinn et al., 2011). On the observational
side, the primary complication is the need to estimate the unabsorbed continuum of the quasar,
relative to which the absorption is measured. (In our notation, $F$ is the ratio of the observed
flux to that of the unabsorbed continuum.) For statistical analysis of a large sample, the continuum
does not have to be accurate on a quasar-by-quasar basis, and there are strategies (such as
measuring fluctuations relative to a running mean) for mitigating any bias caused by continuum
errors (see, e.g., Slosar et al. 2011). Nonetheless, residual uncertainties from continuum
determination can be significant compared to the precision of measurements.

A discrepancy between clustering growth inferred from the Lyα forest and cosmological models
favored by other data would face a stiff burden of proof, to demonstrate that the Lyα forest results
were not biased by the theoretical and observational systematics discussed above. However,
complementary clustering statistics and different physical scales have distinct responses to
systematics and to changes in the matter clustering amplitude, so it may be possible to build a
convincing case. For example, the bispectrum (Mandelbaum et al.) and flux probability distribution
(e.g., Lidz et al. 2006) provide alternative ways to break the degeneracy between mean absorption
and power spectrum amplitude and to test whether a given model of IGM physics is really an adequate
description of the forest. Lyα forest tests will assume special importance if other measures
indicate discrepancies at lower redshifts with the growth predicted by GR combined with simple dark
energy models. Growth measurements at $z \sim 3$ from the Lyα forest could then play a critical role
in distinguishing between modified gravity explanations and models with unusual dark energy history.
Because it probes high redshifts and moderate overdensities, the Lyα forest can also constrain the
primordial power spectrum on small scales that are inaccessible to other methods. The resulting
lever arm may be powerful for detecting or constraining the scale-dependent growth expected in some
modified gravity models, as discussed further in the next section.



7.7. Other Tests of Modified Gravity

We have concentrated our discussion of modified gravity on tests for consistency between measured
matter fluctuation amplitudes and growth rates (from weak lensing, clusters, and redshift-space
distortions) with the predictions of dark energy models that assume GR. However, "not General
Relativity" is a broad category, and there are many other potentially observable signatures of
modified gravity models. For an extensive review of modified gravity theories and observational
tests, we refer the reader to Jain and Khoury. We follow their notation and discussion in our brief
summary here.

In Newtonian gauge, the spacetime metric with scalar perturbations can be written in the form

$$ ds^2 = -(1 + 2\Psi)\,dt^2 + (1 - 2\Phi)\,a^2(t)\,dl^2 , \qquad (158) $$

which is general to any metric theory of gravity. If the dominant components of the stress-energy
tensor have negligible anisotropic stress, then the Einstein equation of GR predicts that
$\Psi = \Phi$, i.e., the same gravitational potential governs the time-time and space-space
components of the metric. We have made this assumption implicitly in our earlier WL discussion.
Anisotropic stress should be negligible in the matter-dominated era, and most proposed forms of dark
energy (e.g., scalar fields) also have negligible anisotropic stress. Therefore, one generic form of
modified gravity test is to check for the GR-predicted consistency between $\Psi$ and $\Phi$. For
example, if the Ricci curvature scalar $R$ in the GR spacetime action
$S \propto \int d^4x\,\sqrt{-g}\,R$ is replaced by a function $f(R)$, then $\Psi$ and $\Phi$ are
generically unequal. In the forecasts of §8 we focus on GR deviations described by the $G_9$ and
$\Delta\gamma$ parameters that characterize structure growth (§2.2), but an alternative approach
parametrizes the ratios of $\Psi$ and $\Phi$ to their GR-predicted values (see Bean and Tangmatitham
2010; Daniel and Linder 2010, and references therein). The $G_9$ and $\Delta\gamma$ formulation is
well matched to observables that can be measured by large surveys, but the potentials formulation is
arguably closer to the physics of modified gravity.

The main approach to testing the consistency of Ψ and Φ exploits the fact that the gravitational 
accelerations of non-relativistic particles are determined entirely by Ψ, but the paths of 
photons depend on Ψ + Φ. Thus, an inequality of Ψ and Φ should show up observationally as a 
mismatch between mass distributions estimated from stellar or gas dynamics and mass distributions 
estimated from gravitational lensing. (In typical modified gravity scenarios, it is then the lensing 
measurement that characterizes the true mass distribution.) The approximate agreement between 
X-ray and weak lensing cluster masses already rules out large disagreements between Ψ and Φ. A 
systematic statistical approach to this test, employing the techniques discussed in §6, could probably 
sharpen it to the few percent level, limited by the theoretical uncertainty in converting X-ray 
observations to absolute masses. To reach high precision on cosmological scales, the most promising 
route is to test for consistency between growth measurements from redshift-space distortions, 
which respond to the non-relativistic potential Ψ, and growth measurements from weak lensing. 
Implementing an approach suggested by Zhang et al. (2007), Reyes et al. (2010) present a form of 
this test that draws on redshift-space distortion measurements of SDSS luminous red galaxies by 
Tegmark et al. (2006) and galaxy-galaxy lensing measurements of the same population. The precision 
of the test in Reyes et al. (2010) is only ~ 30%, limited mainly by the redshift-space distortion 
measurement, but this is already enough to rule out some otherwise viable models. In the long 
term, this approach could well be pushed to the sub-percent level, with the limiting factors being 
the modeling uncertainty in redshift-space distortions and systematics in weak lensing calibration. 
Similar tests on the ~kpc scales of elliptical galaxies have been carried out by Bolton et al. (2006) 
and Schwab et al. (2010). 

Some modified gravity models allow Ψ and Φ to depend on scale and/or time, yielding an 
"effective" gravitational constant G_Newton → G_Newton(k, t) (where k denotes Fourier wavenumber). 
Scale-dependent gravitational growth will alter the shape of the matter power spectrum relative 
to that predicted by GR for the same matter and radiation content. Precise measurements of the 
galaxy power spectrum shape can constrain or detect such scale-dependent growth. Uncertainties 
in the scale-dependence of galaxy bias may be the limiting factor in this test, though departures 
from expectations could also arise from non-standard radiation or matter content or an unusual 
inflationary power spectrum, and these effects may be difficult to disentangle from scale-dependent 
growth. The lever arm for determining the power spectrum shape can be extended by using the 
Lyα forest or, in the long term, redshifted 21cm maps to make small-scale measurements. 
Time-dependent G_Newton would alter the history of structure growth, leading to non-GR values of G_9 or 
Δγ, but it could also be revealed by quite different classes of tests. For example, the consistency 
of big bang nucleosynthesis with the baryon density inferred from the CMB requires G_Newton at 
t ~ 1 sec to equal the present day value to within ~ 10% (Yang et al., 1979; Steigman). 
Variation of G_Newton over the last 12 Gyr would also influence stellar evolution, and it is therefore 
constrained by the Hertzsprung-Russell diagram of star clusters (degl'Innocenti et al., 1996) and 
by helioseismology (Guenther et al., 1998). 



Departures from GR are very tightly constrained by high-precision tests in the solar system, 
and many modified gravity models require a screening mechanism that forces them towards GR in 
the solar system and Milky Way environment. (To give one example, the Shapiro delay of radio 
waves passing near the Sun, measured to agree with GR to five decimal places [Bertotti et al., 2003], 
is reduced in theories of gravity that contain scalar fields, but the effect could be suppressed by 
scalar self-interaction in dense environments.) Screening may be triggered by a deep gravitational 
potential, in which case the strength of gravity could be significantly different in other cosmological 
environments. For a generic class of theories, the value of G_Newton would be higher by a factor of 
4/3 in unscreened environments, allowing order unity effects (see, for example, the discussion of f(R) 
gravity by Chiba 2003 and DGP gravity by Lue 2006). Chang and Hui (2011) suggest tests with 
evolved stars, which could be screened in the dense core and unscreened in the diffuse envelope; the 
stars should be located in isolated dwarf galaxies so that the gravitational potential of the galaxy, 
group, or supercluster environment does not trigger screening on its own. Hui et al. (2009) and 
Jain and VanderPlas (2011) propose testing for differential acceleration of screened and unscreened 
objects in low density environments (e.g., stars vs. gas, or dwarf galaxies vs. giant galaxies), in 
effect looking for macroscopic and order unity violations of the equivalence principle. Jain (2011) 
provides a systematic, high-level review of these ideas and their implications for survey experiments, 
emphasizing the value of including dwarf galaxies at low redshifts within large survey programs. 

Evidence for modified gravity could emerge from some very different direction, such as high 
precision laboratory or solar system tests, tests in binary pulsar systems, or gravity wave 
experiments. In many of these areas, technological advances allow potentially dramatic improvements 
of measurement precision; for example, the proposed STEP satellite could sharpen the test of 
the equivalence principle by five orders of magnitude (Overduin et al., 2009). Modified gravity or 
a dark energy field that couples to non-gravitational forces could also lead to time variation of 
fundamental "constants" such as the fine-structure constant α. Unfortunately, there are no "generic" 
predictions for the level of deviations in these tests, so searches of this sort necessarily remain fishing 
expeditions. However, the existence of cosmic acceleration suggests that there may be interesting 
fish to catch. 



7.8. The Integrated Sachs-Wolfe Effect 

On large angular scales, a major contribution to CMB anisotropies comes from gravitational 
redshifts and blueshifts of photon energies (Sachs and Wolfe, 1967). In a universe with Ω_tot = Ω_m = 
1, potential fluctuations δΦ ~ GδM/R stay constant because (in linear perturbation theory) δM 
and R both grow in proportion to a(t). In this case, a photon's gravitational energy shift depends 
only on the difference between the potential at its location in the last scattering surface and its 
potential at earth. However, once curvature or dark energy becomes important, δM grows slower 
than a(t), potential wells decay, and photon energies gain a contribution from an integral of the 
potential time derivative (∂(Ψ + Φ)/∂t in the notation of §7.7), known as the Integrated Sachs-Wolfe (ISW) 
effect. In more detail, one should distinguish the early ISW effect, associated with the transition 
from radiation to matter domination, from the late ISW effect, associated with the transition to 
dark energy domination. The ISW effect depends on the history of dark energy, which determines 
the rate at which potential wells decay. It can also test whether anisotropy is consistent with the 
GR prediction, in particular whether the Ψ and Φ potentials are equal as expected. 

As an observational probe, the ISW effect has two major shortcomings. First, it is significant 
only on large angular scales, where cosmic variance severely and unavoidably limits measurement 
precision. (On scales much smaller than the horizon, potential wells do not decay significantly in 
the time it takes a photon to cross them.) Second, even on these large scales the ISW contribution 
is small compared to primary CMB anisotropies. The second shortcoming can be partly 
addressed by measuring the cross-correlation between the CMB and tracers of the foreground matter 
distribution, which separates the ISW effect from anisotropies present at the last scattering 
surface. The initial searches, yielding upper limits on Ω_Λ, were carried out by cross-correlating 
COBE CMB maps with the X-ray background (mostly from AGN, which trace the distribution of 
their host galaxies) measured by HEAO (Boughn et al., 1998; Boughn and Crittenden, 2004). The 
WMAP era, combined with the availability of large optical galaxy samples with well-characterized 
redshift distributions, led to renewed interest in ISW and to the first marginal-significance 
detections (Fosalba et al., 2003; Scranton et al., 2003; Afshordi et al., 2004; Boughn and Crittenden, 
2004; Fosalba and Gaztañaga, 2004; Nolta et al., 2004). 

Realizing the cosmological potential of the ISW effect requires cross-correlating the CMB with 
large scale structure tracers over a range of redshifts at the largest achievable scales, and properly 
treating the covariance arising from the redshift range and sky coverage of each data set. Ho et al. 
(2008) used 2MASS objects (z < 0.2), photometrically selected SDSS LRGs (0.2 < z < 0.6) and 
quasars (0.6 < z < 2.0), and NVSS radio galaxies, finding an overall detection significance of 3.7σ. 
Giannantonio et al. (2008) used a similar sample (but with a different SDSS galaxy and quasar 
selection, and with the inclusion of the HEAO X-ray background maps) and found a 4.5σ detection 
of ISW. Both of these measurements are consistent with the "standard" ΛCDM cosmology. (The 
Ho et al. [2008] measurement is almost 2σ above ΛCDM, but we attribute no special significance to this.) 
Zhao et al. (2010) utilize the Giannantonio et al. (2008) measurement in combination with other 
data to test for late-time transitions in the potentials Φ and Ψ, finding consistency with GR. 

Giannantonio et al. (2008) estimate that a cosmic variance limited experiment could achieve 
a 7-10σ ISW detection. Because of the low S/N ratio, the ISW effect does not add usefully to 
the precision of parameter determinations within standard dark energy models, but it could reveal 
signatures of non-standard models. Early dark energy (dynamically significant at the redshift 
of matter-radiation equality) can produce observable CMB changes via the early ISW effect 
(Doran et al. 2007; de Putter et al. 2009 discuss the related problem of constraining early dark 
energy via CMB lensing). Perhaps the most interesting application of ISW measurements is to 
constrain, or perhaps reveal, inhomogeneities in the dark energy density (see §2.2), which produce 
CMB anisotropies via ISW and are confined to large scales in any case (see de Putter et al. 2010). 
However, it is not clear whether even exotic models can produce an ISW effect that is distinguishable 
from the ΛCDM prediction at high significance. Measuring the ISW cross-correlation requires 
careful attention to angular selection effects in the foreground catalogs, but these effects should 
be controllable, and independent tracers allow cross-checks of results. Since the prediction of 
conventional dark energy models is robust compared to expected statistical errors, a clear deviation 
from that prediction would be a surprise with important implications. 



7.9. Cross-Correlation of Weak Lensing and Spectroscopic Surveys 

Our forecasts in §7.2 incorporate an ambitious Stage IV weak lensing program, and in §8.5.3 
we consider the impact of adding an independent measurement of f(z)σ_8(z) from redshift-space 
distortions in a spectroscopic galaxy survey, finding that a 1-2% measurement can significantly 
improve constraints on the growth-rate parameter Δγ relative to our fiducial program. However, 
some recent papers (Bernstein and Cai, 2011; Gaztañaga et al., 2011; Cai and Bernstein, 2011) 
suggest that a combined analysis of overlapping weak lensing and galaxy redshift surveys can 
yield much stronger dark energy and growth constraints than an after-the-fact combination of 
independent WL and RSD measurements. 

The analysis envisioned in these papers involves measurement of all cross-correlations among 
the WL shear fields (and perhaps magnification fields) in tomographic bins, the angular clustering 
of galaxies in photo-z bins of the imaging survey, and the redshift-space clustering of galaxies in 
redshift bins of the spectroscopic survey, as well as auto-correlations of these fields. While the forecast 
gains emerge from detailed Fisher-matrix calculations, the essential physics (Bernstein and Cai, 
2011; Gaztañaga et al., 2011) appears to be absolute calibration of the bias factor of the 
spectroscopic galaxies via their weak lensing of the photometric galaxies (galaxy-galaxy lensing, §5.2.6). 
This calibration breaks degeneracy in the modeling of RSD, and it effectively translates the 
spectroscopic measurement of the galaxy power spectrum into a normalized measurement of the matter 
power spectrum. While the second technique can also be applied to galaxy clustering in the 
photometric survey, using photo-z's, the clustering measurement in a spectroscopic survey is much 
more precise because there are more modes in 3-d than in 2-d. The cross-correlation approach 
is more powerful if the spectroscopic survey includes galaxies with a wide range of bias factors 
(McDonald and Seljak, 2009), e.g., a mix of massive absorption-line galaxies and lower mass 
emission-line galaxies. 

These studies are still in an early phase, and it remains to be seen what gains can be realized in 
practice. Stochasticity in the relation between the galaxy and mass density fields depresses 
cross-correlations relative to auto-correlations, a potentially important theoretical systematic, though 
stochasticity is expected to be small at large scales, and corrections can be computed with 
halo-based models that are constrained by small and intermediate-scale clustering. Conversely, it may 
be possible to realize these gains even when the weak lensing and spectroscopic surveys do not 
overlap, by calibrating the bias of a photometric sample that has the same target selection criteria 
as the spectroscopic sample. Gaztañaga et al. (2011) also investigate the possibility of implementing 
these techniques in a narrow-band imaging survey with large numbers of filters, which is effectively 
a low-resolution spectroscopic survey. They find that most of the gains of a spectroscopic survey are 
achieved if the rms photo-z uncertainty is Δz/(1 + z) < 0.0035, while larger uncertainties degrade 
the results. 



7.10. Other Alternatives: Galaxy Ages, Strong Lenses, and Redshift-Drift 

We have described those techniques that we think might become (or remain) competitive with 
the four discussed in §§3-6, but we have certainly not exhausted the literature on possible probes 
of cosmic acceleration. We conclude this section by briefly summarizing three other methods that 
have gained attention at various times. 

In principle the age of the universe is an observable that can probe cosmic expansion history. 
The integral that determines age, 



$$ t(z) = \int_z^\infty \frac{dz'}{1+z'}\, H^{-1}(z') , \tag{159} $$



is similar to the integral for comoving distance (eq. 7), except that it extends from z to ∞ 
instead of from 0 to z. The conflict between the ages of globular clusters and the value of t_0 in a 
decelerating universe was one of the significant early arguments for cosmic acceleration, and some 
authors have employed ages of high-redshift galaxies as a constraint on dark energy models (e.g., 
Lima and Alcaniz 2000). Jimenez and Loeb (2002) proposed using differential ages of galaxies at 
different redshifts to measure, in effect, H(z). While this approach removes some of the 
uncertainties in the population synthesis models, it relies on identifying a population of galaxies at one 
redshift that is just an aged version of a population at a higher redshift. Observational studies thus 
far have concentrated on the massive ellipticals, as these have the fewest complications (dust, 
ongoing star formation). Stern et al. (2010) use spectroscopic observations across the rest-frame optical 
and UV (to break the age-metallicity degeneracy) to measure H(z = 0.5) = 97 ± 62 km s^-1 Mpc^-1 
and H(z = 0.8) = 90 ± 40 km s^-1 Mpc^-1. However, for precision studies the method faces a range 
of possible complications, including low-level star formation, accretion of other stellar populations, 
and complexity of the original stellar population (e.g., a range of metallicities). We do not presently 
see a path to competitive (few percent or better) cosmological constraints from this method. 
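
The age integral of eq. (159) and the differential-age idea can be illustrated numerically. The 
following is only a sketch, not the analysis of Jimenez and Loeb or of Stern et al.: it assumes a 
flat ΛCDM model with illustrative parameters (H_0 = 70 km s^-1 Mpc^-1 and Ω_m = 0.3, both chosen 
here for the example), evaluates t(z) by simple quadrature, and recovers H(z) from the finite 
difference H ≈ -(1 + z)^-1 Δz/Δt between two idealized populations at neighboring redshifts. All 
function names are hypothetical. 

```python
import math

# Illustrative flat LambdaCDM parameters (assumptions for this sketch only).
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
KM_PER_MPC = 3.0857e19
SEC_PER_GYR = 3.156e16


def hubble(z):
    """H(z) in km/s/Mpc for flat LambdaCDM (radiation neglected)."""
    return H0_KM_S_MPC * math.sqrt(OMEGA_M * (1 + z) ** 3 + 1 - OMEGA_M)


def age_gyr(z, n_steps=20000):
    """Age t(z) = int_z^inf dz' / [(1+z') H(z')], returned in Gyr.

    Substituting a = 1/(1+z') turns this into int_0^{a(z)} da / [a H(a)],
    a finite interval that is integrated here with the midpoint rule.
    """
    a_max = 1.0 / (1.0 + z)
    da = a_max / n_steps
    total = 0.0
    for i in range(n_steps):
        a = (i + 0.5) * da
        total += da / (a * hubble(1.0 / a - 1.0))
    # total is in Mpc s / km; convert to seconds, then to Gyr.
    return total * KM_PER_MPC / SEC_PER_GYR


def hubble_from_ages(z_lo, z_hi):
    """Differential-age estimate of H at the mean redshift, in km/s/Mpc."""
    dz = z_hi - z_lo
    dt_gyr = age_gyr(z_hi) - age_gyr(z_lo)  # negative: higher z is younger
    z_mid = 0.5 * (z_lo + z_hi)
    rate_per_gyr = -dz / dt_gyr / (1 + z_mid)
    return rate_per_gyr / SEC_PER_GYR * KM_PER_MPC


if __name__ == "__main__":
    print(f"t(0) = {age_gyr(0.0):.2f} Gyr")
    print(f"H(0.5) from ages: {hubble_from_ages(0.45, 0.55):.1f} km/s/Mpc")
    print(f"H(0.5) direct:    {hubble(0.5):.1f} km/s/Mpc")
```

In this idealized setting the finite-difference estimate reproduces the input H(z) closely; the 
practical difficulty described above lies entirely in measuring the age difference Δt from real 
stellar populations. 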

Strong gravitational lenses can be employed to measure expansion history in a variety of ways. 
One is to use angular positions of multiple sources with different (known) redshifts to obtain 
distance ratios (e.g., Soucail et al. 2004), a strong-lensing version of the cosmography test discussed 
in §5.2.7. With time delay measurements one can infer H_0, or more generally the angular diameter 
distance relation D_A(z) (e.g., Suyu et al. 2010). The critical systematic for both approaches is 
uncertainty in the mass distribution of the lens, which enters predictions of the angular positions 
and time delays. When the sources are resolved (i.e., galaxies rather than AGN in the optical, or 
jets mapped with VLBI), or when the multiplicity of images is unusually high, or when additional 
dynamical data (such as the stellar velocity dispersion) are available, then reproducing the data 
places more stringent constraints on the mass distributions (e.g., Suyu et al. 2011). Even in the best 
cases, however, we suspect it will be difficult to achieve percent-level accuracy for any given lens. 
Possibly the uncertainties in mass distributions can be parameterized and effectively converted 
to statistical errors by marginalization. If these errors are uncorrelated from one system to the 
next, then careful measurements and modeling of a number of well constrained lenses could lead 
to high aggregate precision. The number of multiple-image lenses as a function of separation and 
redshift (or giant arcs as a function of curvature radius and redshift) offers another cosmological 
test that depends on both geometry and structure growth (e.g., Oguri et al. 2008; Horesh et al. 
2011). However, this test relies on the statistics of all strong lenses, not just the ones selected to 
have well constrained mass models. While some level of "self-calibration" is possible, we think it 
unlikely that systematic uncertainties in the evolution of mass distributions can be controlled well 
enough to make this approach competitive. 

The redshift of a comoving source changes as the universe expands. Sandage (1962) was the first 
to propose this "redshift drift" as a cosmological test, but he noted that it appeared far beyond the 
capabilities of existing experimental techniques. Loeb (1998) repopularized the idea, noting that 
high-resolution spectrographs on large telescopes could potentially measure the effect in 
absorption-line spectra of high-redshift quasars. Quercellini et al. (2010) provide an extensive review of the 
redshift-drift method and other forms of "real-time cosmology" experiments. The expected change 
in redshift over a time interval Δt, expressed as an apparent velocity shift, is 

$$ \Delta v \equiv \frac{c\,\Delta z}{1+z} = c H_0 \Delta t \left[ 1 - \frac{H(z)}{(1+z) H_0} \right] , \tag{160} $$

which vanishes for a coasting universe with H(z)/H_0 = (1 + z). For our fiducial cosmological 
model, the predicted change over a Δt = 10 year observational span is +1.32 cm s^-1, -1.21 cm s^-1, 
and -3.66 cm s^-1 for sources at z = 2, 3, and 4, respectively. Corasaniti et al. (2007) estimate that 
observations of 240 quasars over a span of 30 years using the CODEX spectrograph proposed for 
the European Extremely Large Telescope could measure H(z) over z = 2-5 with an aggregate 
precision of ~ 2% (see also Balbi and Quercellini 2007). The BAO component of the fiducial Stage 
IV program that we present in §8.1 (which assumes 25% sky coverage and errors that are 1.8 times 
those of linear theory sample variance) yields errors of 0.6-0.7% in H(z) per bin of 0.07 in ln(1 + z) 
at z = 2-3 (see Table 6). The fiducial BAO program would thus be substantially more powerful 
than the redshift-drift approach, but it is not yet clear that high-redshift BAO surveys of this 
volume will prove practical. 
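
The size and sign of the redshift-drift signal can be checked with a few lines of code. The sketch 
below assumes the standard expression Δv = c H_0 Δt [1 - H(z)/((1 + z) H_0)] for the apparent 
velocity shift and a flat ΛCDM expansion history; the default parameter values (H_0 = 70 km s^-1 
Mpc^-1, Ω_m = 0.27) are illustrative assumptions rather than the fiducial model of this paper, so the 
output should be close to, but not identical with, the numbers quoted above. Function names are 
hypothetical. 

```python
import math

C_CM_S = 2.998e10       # speed of light in cm/s
KM_PER_MPC = 3.0857e19  # kilometres per megaparsec
SEC_PER_YEAR = 3.156e7  # seconds per year


def e_of_z(z, omega_m):
    """H(z)/H0 for flat LambdaCDM (radiation neglected)."""
    return math.sqrt(omega_m * (1 + z) ** 3 + 1 - omega_m)


def drift_cm_s(z, years, h0_km_s_mpc=70.0, omega_m=0.27, e_func=e_of_z):
    """Apparent velocity shift c*H0*dt*[1 - H(z)/((1+z)*H0)] in cm/s.

    e_func(z, omega_m) supplies H(z)/H0, so other expansion histories
    can be substituted for the assumed flat LambdaCDM form.
    """
    h0_per_sec = h0_km_s_mpc / KM_PER_MPC
    dt_sec = years * SEC_PER_YEAR
    return C_CM_S * h0_per_sec * dt_sec * (1 - e_func(z, omega_m) / (1 + z))


if __name__ == "__main__":
    for z in (2, 3, 4):
        print(f"z = {z}: dv = {drift_cm_s(z, 10):+.2f} cm/s over 10 yr")
    # A coasting universe, H/H0 = 1 + z, gives no drift at any redshift.
    print(drift_cm_s(3, 10, e_func=lambda z, om: 1 + z))
```

With these assumed parameters the shift is positive at z = 2 and negative at z = 3 and 4, at the 
level of a few cm s^-1 per decade, which is why the measurement demands both very stable 
spectrographs and long time baselines. 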
8. A Balanced Program on Cosmic Acceleration 



Having discussed many observational methods individually, we now turn to what we might hope 
to learn from them in concert. To the extent that this report has an underlying editorial theme, it 
is the value of a balanced observational program that pursues multiple techniques at comparable 
levels of precision. In our view, there is much more to be gained by doing a good job on three or 
four methods than by doing a maximal job on one at the expense of the others. This is not a "try 
everything" philosophy: moving forward from where we are today, an observational method is 
interesting only if it has reasonable prospects of achieving percent- or sub-percent-level errors, both 
statistical and systematic, on observables such as H(z), D(z), and G(z). The successes of cosmic 
acceleration studies to date have raised the field's entry bar impressively high. 

A balanced strategy is important both for cross-checking of systematics and for taking advantage 
of complementary information. Regarding systematics, the next generation of cosmic acceleration 
experiments seeks much higher precision than those carried out to date, so the risk of being limited or 
biased by systematic errors is much higher. Most methods allow internal checks for systematics 
(e.g., comparing distinct populations of SNe, measuring angular dependence and tracer dependence 
of BAO signals, testing for B-modes and redshift-scaling of WL), but conclusions about cosmic 
acceleration will be far more convincing if they are reached independently by methods with different 
systematic uncertainties. Two methods only provide a useful cross-check of systematics if they have 
comparable statistical precision; otherwise a result found only in the more sensitive method cannot 
be checked by the less sensitive method. 

Regarding information content, we have already emphasized the complementarity of SN and 
BAO as distance determination methods. SN have unbeatable statistical power at z < 0.6, while 
BAO surveys that map a large fraction of the sky with adequate sampling can achieve higher 
precision at z > 0.8. Overlapping SN and BAO measurements provide independent physical information 
because the former measure relative distances and the latter absolute distances (h^-1 Mpc vs. Mpc), 
and the value of h is itself a powerful dark energy diagnostic in the context of CMB constraints 
(see §7.1 and §8.5.1). WL, clusters, and redshift-space distortions provide independent constraints 
on expansion history, at levels that can be competitive with SN and BAO, and they provide 
sensitivity to structure growth. Without structure probes, we would have little hope of clues that might 
locate the origin of acceleration in the gravitational sector rather than the stress-energy sector, and 
we would, more generally, reduce the odds of "surprises" that might push us beyond our current 
theories of cosmic acceleration. 

The primary purpose of this section is to present quantitative forecasts for a program of Stage IV 
dark energy experiments and to investigate how the forecast constraints depend on the performance 
of the individual components of such a program. Our forecasts are analogous to those of the DETF 
(Albrecht et al., 2006), updated with a more focused idea of what a Stage IV program might 
look like, and updated in light of subsequent work on parameterized models and figures of merit 
for dark energy experiments, most directly that of the JDEM Figure-of-Merit Science Working 
Group (FoMSWG; Albrecht et al., 2009). In §8.1 we summarize our assumptions about the fiducial 
program. In §8.2 we describe the methodology of our forecasts, in particular the construction of 
Fisher matrices for the fiducial program. In §8.3 we present results for the fiducial program and for 
variants in which one or more components of this program are made significantly better or worse. 
We also compare these results to forecasts of a "Stage III" program represented by experiments 
now underway or nearing their first observations. 

We have elected to focus on SN, BAO, and WL as the components of these forecasts, for two 
reasons. First, it is more straightforward (though still not easy) to define the expected statistical 
and systematic errors for these methods than for others. Second, the most promising alternative 
methods (clusters, redshift-space distortions, and the Alcock-Paczynski effect) will be enabled 
by the same data sets obtained for WL and BAO studies. It is therefore reasonable to view these 
as auxiliary methods that may improve the return from these data sets (perhaps by substantial 
factors) rather than as drivers for the observational programs themselves. In §§8.4 and 8.5 we 
present forecasts for how well the fiducial CMB+SN+BAO+WL programs predict the observables 
of these and other alternative methods, providing a target for how well they must perform to add 
new information beyond that in our primary probes. In some cases we find that plausible levels of 
performance could substantially improve tests of cosmic acceleration models. 

8.1. A Fiducial Program 

As discussed in §1.3, Astro2010 and the European Astronet report have placed high priority on 
ground- and space-based dark energy experiments. The Stage III experiments currently underway 
will already allow much stronger tests of cosmic acceleration models, and Stage IV facilities built 
over the next decade should advance the field much further still. Our Stage IV program corresponds 
roughly to the goals recommended by the Cosmology and Fundamental Physics panel report of 
Astro2010. 

For SN studies, we anticipate that Stage IV efforts will be limited not by statistical errors but 
by systematics associated with photometric calibration, dust extinction, and evolution of the SN 
population. For our fiducial program, we assume that SN surveys will achieve net errors (statistical 
+ systematic) of 0.01 mag for the mean distance modulus in each of three redshift bins of width 
Δz = 0.2 extending from z = 0.2 to a maximum redshift z_max = 0.8 (see discussion in §3.4). We 
also assume the existence of a local SN sample at z = 0.05 with the same 0.01 mag net error. High 
quality observations could yield a smaller systematic error in the local sample, but we suspect that 
the most challenging systematic for this local calibration will be transferring it to the more distant 
bins. We treat the bin-to-bin errors as uncorrelated, though this is clearly an approximation to 
systematic errors that are correlated at nearby redshifts and gradually decorrelate as one considers 
differing redshift ranges and observed-frame wavelengths. Even with 0.15 mag errors per SN, 
achieving this level of statistical error requires only 225 SNe per bin, and we expect that the error 
per SN can be reduced by working at red/IR wavelengths and by selecting sub-populations based 
on host galaxy type, spectral properties, and light curve shape. For purely ground-based efforts, we 
consider our 0.01 mag floor for systematic errors to be somewhat optimistic, given the challenges 
of dust extinction corrections and photometric calibration. However, a space-based program at 
rest-frame near-IR wavelengths, enabled by WFIRST, could plausibly achieve better than 0.01 
mag systematics. We suspect that it will be hard to push calibration and evolution systematics 
below 0.005 mag even with WFIRST, and pushing statistical errors below this level begins to place 
severe demands on spectroscopic capabilities, unless purely photometric information can be used 
to identify populations with scatter below 0.1 mag per SN. We consider the impact of increasing 
z_max beyond 0.8, but the power of the SN program depends much more strongly on the magnitude 
error than on the maximum redshift. 
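
The sample sizes quoted in this paragraph follow from the usual 1/√N scaling of the error on a 
mean. The short sketch below makes that arithmetic explicit, under the assumption (made here 
for illustration) that the per-SN errors are independent and Gaussian with a common scatter; the 
function names are hypothetical. 

```python
import math


def mean_modulus_error(sigma_per_sn, n_sne):
    """Statistical error on the mean distance modulus of n_sne supernovae."""
    return sigma_per_sn / math.sqrt(n_sne)


def sne_needed(sigma_per_sn, target_error):
    """Smallest number of SNe whose mean reaches target_error (in mag).

    The small offset guards against floating-point round-off pushing an
    exact integer result up to the next integer.
    """
    return math.ceil((sigma_per_sn / target_error) ** 2 - 1e-9)


if __name__ == "__main__":
    print(sne_needed(0.15, 0.01))   # the case discussed in the text
    print(sne_needed(0.15, 0.005))  # halving the target quadruples the sample
    print(sne_needed(0.10, 0.005))  # lower scatter per SN eases the demand
```

With 0.15 mag scatter per SN, a 0.01 mag statistical error requires 225 SNe per bin, as stated 
above, while a 0.005 mag target would require 900 at the same scatter or 400 at 0.1 mag per SN. 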

For BAO, the primary metric of statistical constraining power is the total comoving volume 
mapped spectroscopically with a sampling density high enough to keep shot-noise sub-dominant. 
There are several projects in the planning stages that could map significant fractions of the comoving 
volume available out to z ~ 3. These include the near-IR spectroscopic components of Euclid and 
WFIRST, ground-based optical facilities such as BigBOSS, DEspec, and SuMIRe PFS, and radio 
intensity-mapping experiments (see §4.7). For our fiducial program, we assume that these projects 
will collectively map 25% of the comoving volume out to z = 3, with errors a factor of 1.8 larger 
than the linear theory sample variance errors. We specifically assume full redshift coverage out to 
z = 3 with f_sky = 25% sky fraction, but other combinations of redshift coverage and f_sky that 
have the same total comoving volume yield similar results. The factor 1.8 accounts for imperfect 
sampling (hence non-negligible shot-noise) and for non-linear degradation of the BAO signal. It 
approximates the effects of sampling with nP = 2 and using reconstruction (§4.3.3) to remove 50% 
but not 100% of the non-linear Lagrangian displacement of tracers. We implicitly assume that 
theoretical systematics associated with location of the BAO peak will remain below this level, an 
assumption we consider reasonable but not incontrovertible based on the discussion in §4.5. 

For WL, the primary metric of statistical constraining power is the total number of galaxies 
that have well measured shapes and good enough photometric redshifts to allow accurate model 
predictions and removal of intrinsic alignment systematics. For our fiducial case, we assume a 
survey of 10^4 deg^2 achieving an effective surface density of 23 galaxies per arcmin^2 with z_med = 0.84, 
corresponding to I_AB < 25 and r_eff > 0.25". The effective galaxy number is 8.3 x 10^8. Euclid 
can likely achieve a higher surface density, though perhaps with a smaller survey area. WFIRST 
could plausibly reach this surface density over a somewhat smaller survey area with a 2-2.5 year 
WL campaign. LSST will survey a larger area, and it might or might not achieve this effective 
surface density, depending on how low a value of r_eff/r_PSF it can work to before shape measurements 
are systematics dominated. We compute constraints from cosmic shear in 14 bins of photometric 
redshift and from the shear-ratio test described in §5.2.7, but we do not incorporate higher 
order lensing statistics or galaxy-shear cross-correlations. We include information up to multipole 
l_max = 3000, beyond which statistical power becomes limited at this surface density and systematic 
uncertainties associated with non-linear evolution and baryonic effects become significant. 

Forecasting the systematic uncertainties in Stage IV WL experiments is very much a shot in 
the dark. Systematic errors are already comparable to statistical errors in surveys of 100 deg$^2$, so
lowering them to the level of statistical errors in a $10^4$ deg$^2$ survey that has higher galaxy surface
density requires more than an order of magnitude improvement. We therefore consider a "fiducial"
and an "optimistic" case for WL systematics. For the fiducial case, we incorporate (and marginalize
over) aggregate uncertainties of $2 \times 10^{-3}$ in shear calibration and $2 \times 10^{-3}$ in the mean photo-z,
with errors in each redshift bin larger by $\sqrt{14}$ but uncorrelated across bins. We also incorporate
intrinsic alignment uncertainty as described by Albrecht et al. (2009; §2h of Appendix A), which
includes marginalization over both GI and II components (see §5.6.1). For our "optimistic" case
we adopt no specific form of the systematic errors but simply assume that they will double the
statistical errors throughout. At an order of magnitude level, we can see that the optimistic case
corresponds to a global fractional error $\sigma \sim 2N_{\rm mode}^{-1/2} \sim 2f_{\rm sky}^{-1/2}\ell_{\rm max}^{-1} \approx 1.3 \times 10^{-3}$, significantly lower
than the fiducial case assumption of $2 \times 10^{-3}$ errors for shear and photo-z calibration (which, roughly
speaking, combine in quadrature to make a $2.8 \times 10^{-3}$ multiplicative uncertainty). However, at
scales and redshifts where the statistical errors are large, multiplying them by two can be a larger
change than adding the shear-calibration and photo-z systematics. As a result, there will be some
measures (e.g., the error on $\Omega_k$) for which our "optimistic" program performs slightly worse than
our fiducial program. Of course, WL experiments that achieved the statistical limits of several
$\times 10^9$ source galaxies (possible in principle) would be several times more powerful than even
our optimistic scenario.
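The order-of-magnitude numbers in the WL discussion can be reproduced with a short calculation. The sketch below assumes a survey area of $10^4$ deg$^2$, a mode count of roughly $f_{\rm sky}\ell_{\rm max}^2$, and calibration errors of $2 \times 10^{-3}$; the constant names are illustrative and the mode count is only a rough estimate.

```python
import math

AREA_DEG2 = 1.0e4          # fiducial survey area, square degrees
FULL_SKY_DEG2 = 41252.96   # total sky area, square degrees
N_EFF_ARCMIN2 = 23.0       # effective source density per square arcminute
ELL_MAX = 3000             # maximum multipole used in the forecast

f_sky = AREA_DEG2 / FULL_SKY_DEG2
n_gal_eff = AREA_DEG2 * 3600.0 * N_EFF_ARCMIN2   # 3600 arcmin^2 per deg^2

# "Optimistic" case: total error = 2 x statistical, with the statistical
# error on an overall amplitude taken as N_mode**-0.5 and
# N_mode ~ f_sky * ELL_MAX**2 (an order-of-magnitude mode count).
sigma_optimistic = 2.0 / (math.sqrt(f_sky) * ELL_MAX)

# Fiducial case: shear and photo-z calibration errors added in quadrature.
sigma_fiducial = math.hypot(2.0e-3, 2.0e-3)

print(f"effective galaxy number: {n_gal_eff:.2e}")
print(f"optimistic global error: {sigma_optimistic:.2e}")
print(f"fiducial calibration error: {sigma_fiducial:.2e}")
```

The optimistic estimate comes out near $1.3 \times 10^{-3}$, about half of the $2.8 \times 10^{-3}$ quadrature sum for the fiducial calibration errors.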



This is equivalent to assuming linear theory sample variance over a fractional volume $25\%/1.8^2 = 7.7\%$.
At least if exposures are deep enough to reach the $25\sigma$ detection threshold we think is necessary to achieve
accurate shape measurements; see §5.5.

8.2. Forecasting Constraints

The fiducial program outlined above provides a baseline for evaluating improvement in the 
determination of the cosmological parameters relative to current constraints. We use a Fisher 
matrix analysis to quantify this improvement and to study the complementarity of the main probes 
of cosmic acceleration. Since our knowledge of the exact design of future surveys and the systematic 
errors they will face is inherently imperfect, we also consider the effect of varying the precision of 
each technique in our forecasts, including both pessimistic and optimistic cases for SN, BAO, and 
WL data. 

Quantifying the impact of each probe on our understanding of cosmic acceleration requires 
metrics for evaluating progress. The precision with which the dark energy equation of state (and 
its possible time dependence) can be measured is a common choice; while not the only quantity 
of interest, it is clearly a central piece of the puzzle. One of the main quantities we use below is 
the DETF figure of merit defined earlier, ${\rm FoM} = [\sigma(w_p)\sigma(w_a)]^{-1}$. The FoM indicates
how well an experiment determines the dark energy equation of state parameter and its derivative
$dw/da$ at the pivot redshift $z_p$, and it thereby indicates the ability to detect deviations from the
standard $\Lambda$CDM model with $w_p = -1$ and $w_a = 0$.

While the DETF FoM is relatively simple to evaluate for a particular experiment, it omits 
much of the information that will be available from future experiments, including some potentially 
important clues to the nature of cosmic acceleration. For example, the true dark energy dynamics 
may be considerably more complicated than what the two-parameter linear model can accommodate,
so that constraints on $w_0$ and $w_a$ may yield incomplete or misleading results. Additionally,
the equation of state alone is insufficient to describe the full range of possible alternatives to the
standard cosmological model. For example, modified gravity theories can mimic the effect of any
particular equation of state evolution on the Hubble expansion rate and the distance-redshift
relation while altering the rate of growth of large-scale structure (e.g., Lue et al. 2004; Song et al.
2007). Including such possibilities requires extra parameters that describe changes in the growth
history that are independent of equation of state variations, as discussed in §2.2. Other standard
parameters of the cosmological model, such as the spatial curvature and the Hubble constant, are 
important due to degeneracies with the effects of cosmic acceleration that can limit the precision 
of constraints on the dark energy equation of state. 

To include more general variations of the equation of state as well as altered growth of structure
from modifications to GR on large scales, we adopt the JDEM FoMSWG parameterization
(Albrecht et al. 2009). The equation of state in this parameterization is allowed to vary independently
in each of 36 bins of width $\Delta a = 0.025$ extending from the present to $a = 0.1$ ($z = 9$).
Specifically, the equation of state has a constant value $w_i$ at $1 - 0.025\,i < a < 1 - 0.025\,(i - 1)$,
for $i = 1, \ldots, 36$. At earlier times, the equation of state is assumed to be $w = -1$, although the
impact of this assumption is typically quite small since dark energy accounts for a negligible fraction
of the total density at $z > 9$ in most models. Modifications to the linear growth function of
GR, $G_{\rm GR}(z)$, are included through the parameters $G_9$ and $\Delta\gamma$ introduced in §2.2.
These parameters describe the change relative to GR in the normalization of the growth of structure
at $z = 9$ and in the growth rate at $z < 9$, respectively. Adding these to the binned $w_i$ values
and the standard $\Lambda$CDM parameters, the full set is

$$\mathbf{p} = (w_1, \ldots, w_{36}, \ln G_9, \Delta\gamma, \Omega_m h^2, \Omega_b h^2, \Omega_k h^2, \Omega_\phi h^2, \ln A_s, n_s, \Delta\mathcal{M}) , \tag{161}$$

where the primordial amplitude $A_s$ is defined at $k = 0.05\ {\rm Mpc}^{-1}$ and $\Delta\mathcal{M}$ is an overall offset in
the absolute magnitude scale of Type Ia supernovae. The Hubble constant is determined by these
parameters through $h^2 = \Omega_m h^2 + \Omega_k h^2 + \Omega_\phi h^2$. We compute our forecasts at the fiducial parameter
values chosen by the FoMSWG to match CMB constraints from the 5-year release of WMAP data
(Komatsu et al. 2009); these are listed in Table 5. These parameters are similar but not identical
to those of the model used earlier in this article, which is based on WMAP7. Note that spatially flat
$\Lambda$CDM and GR are assumed for the fiducial model.

Table 5. Fiducial Model for Forecasts

We use a Fisher matrix analysis to estimate the constraints on these parameters from the fiducial 
program defined in §8.1 and its variations. The Fisher matrix for each experiment consists of a
model of the covariance matrix for the observable quantities and derivatives of these quantities with 
respect to the parameters. We compute the latter numerically with finite differences and confirm 
the results using analytic expressions when possible. 
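The procedure described above (numerical derivatives of the observables, combined with a model covariance) can be sketched as follows. This is a minimal illustration for independent Gaussian errors and a toy linear model, not the forecast code used for the results in this section; `fisher_from_model`, `toy_model`, and the numbers are hypothetical.

```python
def fisher_from_model(model, p0, sigma, rel_step=1e-4):
    """Fisher matrix for independent Gaussian errors `sigma` on model(p).

    Derivatives are taken with central finite differences around p0.
    """
    derivs = []
    for i, value in enumerate(p0):
        h = rel_step * max(abs(value), 1.0)
        p_up, p_dn = list(p0), list(p0)
        p_up[i] += h
        p_dn[i] -= h
        derivs.append([(u - d) / (2.0 * h)
                       for u, d in zip(model(p_up), model(p_dn))])
    n_obs = len(sigma)
    return [[sum(di[k] * dj[k] / sigma[k] ** 2 for k in range(n_obs))
             for dj in derivs] for di in derivs]


X = [0.1, 0.3, 0.5]   # toy "redshifts" at which the observable is measured


def toy_model(p):
    """Hypothetical observable, linear in the two parameters."""
    return [p[0] + p[1] * x for x in X]


F_numeric = fisher_from_model(toy_model, [1.0, 2.0], sigma=[0.01] * 3)

# Analytic result for the same toy model: sums of 1, x, x^2 over sigma^2.
F_analytic = [[3.0e4, 0.9e4], [0.9e4, 0.35e4]]
```

For a model that is linear in the parameters the central difference is exact up to rounding, which makes this a convenient check of the numerical derivatives against an analytic expression.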

We model SN data as measurements of the average SN magnitude in each of several redshift bins 
and in a low-redshift calibration sample. While our fiducial case assumes that the net magnitude 
error is uncorrelated from one bin to the next, we also consider the impact of including a correlated 
component of the error by defining the SN covariance matrix as 

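$$C^{\rm SN}_{\alpha\beta} = \sigma_{m,u}^2 \left(\frac{0.2}{\Delta z}\right) \delta_{\alpha\beta} + \sigma_{m,c}^2 \exp\left(-\frac{|z_\alpha - z_\beta|}{\Delta z_c}\right) , \quad \alpha > 1 \ {\rm and} \ \beta > 1 , \tag{162}$$

where $\Delta z$ is the bin width, $\sigma_{m,u}$ is the uncorrelated error in a bin of width $\Delta z = 0.2$ (or in the local
sample at redshift $z_1$), $\sigma_{m,c}$ is the correlated error with correlation length $\Delta z_c$, and the net error in
each bin $z_\alpha$ ($\alpha > 1$) is $\sigma_m = (\sigma_{m,u}^2 + \sigma_{m,c}^2)^{1/2}$. In general these errors are redshift dependent, but here
we assume that they are constant for simplicity. We do not consider possible correlations between
the local SN sample and the high-redshift bins. For the fiducial forecasts we take $\sigma_{m,c} = 0$, so the
covariance matrix is diagonal. The SN Fisher matrix is then computed as a sum over redshift bins

$$F^{\rm SN}_{ij} = \sum_{\alpha,\beta} \frac{\partial m(z_\alpha)}{\partial p_i} \left[(C^{\rm SN})^{-1}\right]_{\alpha\beta} \frac{\partial m(z_\beta)}{\partial p_j} , \tag{163}$$

where $m(z_\alpha) = 5\log[H_0 D_L(z_\alpha)] + \mathcal{M}$ is the average magnitude in the bin and the derivatives are
taken with respect to the parameters of equation (161).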
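As an illustration of the SN error model, the sketch below builds the covariance matrix for the high-redshift bins, assuming it is the sum of an uncorrelated term scaled by $0.2/\Delta z$ and a correlated term that decays exponentially with redshift separation. The bin centres and the function name `sn_covariance` are assumptions made for the example; the error values are those listed for the correlated-error variation in the key to forecast variations.

```python
import math


def sn_covariance(z_bins, dz, sigma_u, sigma_c, dz_c):
    """Covariance of mean SN magnitudes in the high-redshift bins."""
    n = len(z_bins)
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            # Correlated part: decays with separation in redshift.
            corr = sigma_c ** 2 * math.exp(-abs(z_bins[a] - z_bins[b]) / dz_c)
            # Uncorrelated part: diagonal only, scaled to the bin width.
            uncorr = sigma_u ** 2 * (0.2 / dz) if a == b else 0.0
            cov[a][b] = uncorr + corr
    return cov


# Three bins of width 0.2 over 0.2 < z < 0.8 (bin centres are assumed).
z_bins = [0.3, 0.5, 0.7]
cov = sn_covariance(z_bins, dz=0.2, sigma_u=0.007, sigma_c=0.007, dz_c=0.2)

net_error = math.sqrt(cov[0][0])   # sqrt(sigma_u^2 + sigma_c^2) per bin
```

With equal correlated and uncorrelated components of 0.007 mag, the net error per bin is close to 0.01 mag, while neighbouring bins share a covariance of $\sigma_{m,c}^2 e^{-1}$.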

For BAO we divide the observed volume into bins of equal width in $\ln(1 + z)$, assumed to be
uncorrelated, and compute the Fisher matrix

$$F^{\rm BAO}_{ij} = \sum_\alpha \frac{\partial \mathbf{r}^T(z_\alpha)}{\partial p_i} \left[C^{\rm BAO}(z_\alpha)\right]^{-1} \frac{\partial \mathbf{r}(z_\alpha)}{\partial p_j} , \tag{164}$$

where $\mathbf{r}(z_\alpha) = (D(z_\alpha)/r_s, H(z_\alpha) r_s)$ and $r_s$ is the sound horizon at recombination (see §2.3), for
which we use the fitting formula from Hu (2005).

Table 6. BAO Errors for the Fiducial Program

Note. Column 3 gives the volume of the redshift slice
for $f_{\rm sky} = 0.25$. In all redshift slices, errors on $D/r_s$ and $Hr_s$
are correlated with correlation coefficient $r = 0.409$.
We estimate the covariance matrix in each redshift bin using the BAO forecast code of Seo and Eisenstein,
which provides estimates of the fractional error on distance and the Hubble expansion rate
at each redshift (relative to $r_s$), $\sigma_{\ln(D/r_s)} = \sqrt{C^{\rm BAO}_{11}}/(D/r_s)$ and $\sigma_{\ln(Hr_s)} = \sqrt{C^{\rm BAO}_{22}}/(Hr_s)$, respectively,
as well as the cross correlation $r = C^{\rm BAO}_{12}/\sqrt{C^{\rm BAO}_{11} C^{\rm BAO}_{22}}$. For our default forecasts, we start
with the linear theory cosmic variance predictions, corresponding to the limit of perfect sampling
of the density field within the observed volume and no degradation of the signal due to nonlinear
effects. To approximate the effects of finite sampling and nonlinearity, we increase these errors by
a factor of 1.8 for our fiducial forecasts, which leads to parameter constraints comparable to what
would be expected with sampling $nP = 2$ and reconstruction that halves the effects of nonlinear
evolution. In Table 6 we list the volume for $f_{\rm sky} = 0.25$ and fiducial BAO covariance matrix elements
for 20 redshift slices from $0 < z < 3$. The results we obtain are only weakly dependent on
the number of redshift bins chosen to divide up the total volume.
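The $2 \times 2$ covariance matrix in a single redshift slice follows directly from the two fractional errors and their correlation coefficient. The sketch below works in logarithmic variables, so the covariance entries are fractional; the function names are hypothetical, the example errors are the Stage III-like values quoted for $z = 0.35$ in the key to forecast variations, and applying the fiducial correlation coefficient $r = 0.409$ to them is an assumption made purely for illustration.

```python
def bao_covariance(frac_err_d, frac_err_h, r=0.409):
    """2x2 covariance of (ln(D/r_s), ln(H r_s)) in one redshift slice."""
    off_diag = r * frac_err_d * frac_err_h
    return [[frac_err_d ** 2, off_diag], [off_diag, frac_err_h ** 2]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Errors of 1.0% on D/r_s and 1.8% on H r_s in one slice.
cov = bao_covariance(0.010, 0.018)

# Inverse covariance: the weight this slice contributes to the Fisher
# matrix once multiplied by derivatives of ln(D/r_s) and ln(H r_s).
inv_cov = invert_2x2(cov)
```

Scaling both fractional errors by the same factor (for example 1.8) multiplies every element of `cov` by the square of that factor and leaves the correlation coefficient unchanged.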

The WL Fisher matrix is based on the methodology described by Albrecht et al. (2009), where
the explicit formulas are given. It includes both power spectrum tomography and cross-correlation
cosmography (redshift scaling of the galaxy-galaxy lensing signal), but makes no assumption about
the galaxy bias. The galaxies are sliced into $N_z = 14$ redshift bins and we consider power spectra
in $N_\ell = 18$ bins logarithmically spaced over 10 < $\ell$ < 10^. We consider all power spectra and
cross-spectra of the galaxies $g_i$ and the $E$-mode shear $\gamma^E_i$. This leads to $2N_z$ scalar fields on the
sky, and hence $N_{\rm 2pt} = [2N_z(2N_z + 1)/2] \times N_\ell$ bins in the power spectrum matrix. The length-$N_{\rm 2pt}$
vector $\mathbf{C}$ of power spectra incorporates all 2-point information.

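Our task is now to construct a model both for $\mathbf{C}$ and for its covariance matrix $\Sigma$, and then to
construct the Fisher matrix for parameters $\mathbf{p}$:

$$F_{ij} = \frac{\partial \mathbf{C}^T}{\partial p_i} \Sigma^{-1} \frac{\partial \mathbf{C}}{\partial p_j} , \tag{166}$$

where $^T$ denotes a matrix transpose. Systematic errors may be incorporated as either nuisance
parameters in $\mathbf{p}$ (marginalized over some prior) or as additional contributions to $\Sigma$:

$$\Sigma \rightarrow \Sigma + \sigma_w^2 \frac{\partial \mathbf{C}}{\partial w} \frac{\partial \mathbf{C}^T}{\partial w} , \tag{167}$$

where $w$ is the amplitude of some systematic and $\sigma_w$ is the amount over which it is marginalized.
We incorporate in $\Sigma$ the following contributions: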

• The Gaussian covariance matrix.

• The 1-halo contribution to the shear 4-point function, given by Eq. (A9) of Albrecht et al.
(2009).

• Galaxy bias and stochasticity, fully marginalized in each bin of $\ell$ and $z$.

• The II intrinsic alignment term, obtained by fully marginalizing out the shear auto-correlations
in each redshift slice.

Since we neglect magnification bias, some of these spectra, e.g. the correlation of high-redshift galaxies with
low-redshift shear, are zero for all cosmological models.

"Fully marginalized" means with a sufficiently wide prior that no significant information remains.
• The GI intrinsic alignment term. It is assumed that Eq. (138) will allow estimation of $P_{e\delta}(k)$
and removal of this term in the linear and weakly nonlinear regimes (taken to be •£ < 10^'^ ). 
At smaller scales, we impose a weak prior that the GI not exceed present upper limits. This
is implemented as

$$\frac{|P_{e\delta}(k)|}{P_\delta(k)} < 0.003 \sqrt{N_{\ell,{\rm nonlin}} (N_z - 1)} , \tag{168}$$

where the square root is introduced to prevent many bins from being used to "average down"
this systematic (Albrecht et al. 2009).



The photometric redshift errors (one bias parameter for each bin) and shear calibration errors (also 
one bias parameter for each bin) are treated as nuisance parameters in the parameter vector p and 
are marginalized out before combining with other cosmological probes. 
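The two ways of handling a systematic described above (an extra nuisance parameter with a Gaussian prior, or an extra term in the data covariance) give the same constraint on the parameter of interest. The toy example below checks this numerically for two observables, one cosmological parameter, and one systematic amplitude; all numbers and helper names are arbitrary placeholders chosen for illustration.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quad_form(x, m, y):
    """x^T m y for 2-vectors x, y and a 2x2 matrix m."""
    return sum(x[i] * m[i][j] * y[j] for i in range(2) for j in range(2))


sigma = [[0.04, 0.0], [0.0, 0.09]]   # data covariance (toy values)
dc_dp = [1.0, 2.0]                   # derivative w.r.t. parameter p
dc_dw = [0.5, -1.0]                  # derivative w.r.t. systematic w
sigma_w = 0.3                        # prior width on the systematic

# Route 1: absorb the systematic into the covariance matrix.
sigma_sys = [[sigma[i][j] + sigma_w ** 2 * dc_dw[i] * dc_dw[j]
              for j in range(2)] for i in range(2)]
f_route1 = quad_form(dc_dp, invert_2x2(sigma_sys), dc_dp)

# Route 2: treat w as a nuisance parameter with a Gaussian prior,
# then marginalize it out of the 2x2 (p, w) Fisher matrix.
sigma_inv = invert_2x2(sigma)
f_pp = quad_form(dc_dp, sigma_inv, dc_dp)
f_pw = quad_form(dc_dp, sigma_inv, dc_dw)
f_ww = quad_form(dc_dw, sigma_inv, dc_dw) + 1.0 / sigma_w ** 2
f_route2 = f_pp - f_pw ** 2 / f_ww
```

Both routes return the same marginalized Fisher information on `p`, and both are smaller than the information `f_pp` that would be available if the systematic were known perfectly.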

The forecasts for the main SN, BAO, and WL probes are supplemented by the expected constraints
from upcoming CMB measurements provided by the Planck satellite. We adopt the Fisher
matrix $F^{\rm CMB}$ constructed by the FoMSWG, which includes cosmological constraints from the 70,
100, and 143 GHz channels of Planck with $f_{\rm sky} = 0.7$, assuming that data collected at other frequencies
will be used for foreground removal. The noise level and beam size for each channel come
from the Planck Blue Book (Planck Collaboration 2006). Information from secondary anisotropies
of the CMB is not included in this Fisher matrix; in particular, constraints from the ISW effect
(§7.8) are removed by requiring the angular diameter distance to the CMB to be matched exactly,
as described in Albrecht et al. (2009). Additionally, the large-scale ($\ell < 30$) polarization angular
power spectrum and temperature-polarization cross power spectrum, which mainly contribute to
constraints on the optical depth to reionization $\tau$, are excluded from the forecast and replaced by
a Gaussian prior with width $\sigma_\tau = 0.01$. This prior accounts for uncertainty in $\tau$ due to limited
knowledge of the redshift dependence of reionization, which is not included in the simplest models
of the CMB anisotropies. Although $\tau$ does not appear in the parameter set for the Fisher matrices,
marginalization over $\tau$ in the CMB constraints contributes to the uncertainty on the primordial
power spectrum amplitude $A_s$, which in turn affects predictions for the growth of large-scale structure.

Combined constraints on cosmological parameters are obtained simply by adding the Fisher
matrices of the individual probes, i.e. $F = F^{\rm SN} + F^{\rm BAO} + F^{\rm WL} + F^{\rm CMB}$. Then the forecast for the
parameter covariance is $C = F^{-1}$, and in particular the uncertainty on a given parameter $p_i$ after
marginalizing over the error on all other parameters is $[(F^{-1})_{ii}]^{1/2}$.
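A two-parameter toy case shows why adding Fisher matrices from complementary probes is so effective. In the sketch below the two hypothetical probes have strong but opposite parameter degeneracies; the matrices are made-up numbers, not forecasts from this article.

```python
import math


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def marginalized_errors(fisher):
    """1-sigma errors after marginalizing over the other parameter."""
    cov = invert_2x2(fisher)
    return [math.sqrt(cov[i][i]) for i in range(2)]


def add_fisher(f1, f2):
    """Combine independent probes by summing their Fisher matrices."""
    return [[f1[i][j] + f2[i][j] for j in range(2)] for i in range(2)]


probe_a = [[100.0, 90.0], [90.0, 100.0]]     # degeneracy along one diagonal
probe_b = [[100.0, -90.0], [-90.0, 100.0]]   # degeneracy along the other

err_a = marginalized_errors(probe_a)
err_b = marginalized_errors(probe_b)
err_combined = marginalized_errors(add_fisher(probe_a, probe_b))
```

Each probe alone gives a marginalized error of about 0.23 on either parameter, while the combination gives about 0.07, a larger gain than the factor of $\sqrt{2}$ expected from simply doubling the data.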

Computing the Fisher matrix in the FoMSWG parameter space with a large number of inde- 
pendent bins for w{z) gives us the flexibility to project these forecasts onto a number of simpler 
parameterizations, including the $w_0$-$w_a$ model for the purposes of computing the FoM. To change
from the original parameter set $\mathbf{p}$ to some new set $\mathbf{q}$, we compute

$$\tilde{F}_{kl} = \sum_{i,j} \frac{\partial p_i}{\partial q_k} F_{ij} \frac{\partial p_j}{\partial q_l} , \tag{169}$$

which gives the Fisher matrix $\tilde{F}$ for the new parameterization. In particular, projection from bins
$w_i$ to $w_0$ and $w_a$ involves the derivatives $\partial w_i/\partial w_0 = 1$ and $\partial w_i/\partial w_a = z_i/(1 + z_i)$. We also compute
the pivot redshift $z_p$ and the uncertainty in the equation of state at that redshift, $w_p$. Given the
$2 \times 2$ covariance matrix $C_{ij}$ for $w_0$ and $w_a$ (marginalized over the other parameters), the pivot
values are computed as (Albrecht et al. 2009)

$$z_p = -\frac{C_{12}}{C_{12} + C_{22}} , \qquad \sigma_{w_p}^2 = C_{11} - \frac{C_{12}^2}{C_{22}} , \tag{170}$$

where the first index corresponds to $w_0$ and the second to $w_a$.
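The projection onto $w_0$-$w_a$, the pivot redshift, and the figure of merit can be illustrated with a toy calculation. The sketch below assumes a diagonal Fisher matrix for the 36 binned $w_i$ values with placeholder errors (these are not forecasts from this article), evaluates the Jacobian at bin centres, and uses the standard pivot expressions $z_p = -C_{12}/(C_{12} + C_{22})$ and $\sigma_{w_p}^2 = C_{11} - C_{12}^2/C_{22}$.

```python
import math

DELTA_A = 0.025
a_bins = [1.0 - DELTA_A * (i - 0.5) for i in range(1, 37)]   # bin centres
z_bins = [1.0 / a - 1.0 for a in a_bins]

# Placeholder per-bin errors on w_i (NOT values from the article); the toy
# Fisher matrix for the w_i is diagonal with entries 1 / sigma_i**2.
sigma_w = [0.05 * (1.0 + z) ** 2 for z in z_bins]
weights = [1.0 / s ** 2 for s in sigma_w]

# Jacobian rows: (dw_i/dw_0, dw_i/dw_a) = (1, z_i / (1 + z_i)).
jac = [(1.0, z / (1.0 + z)) for z in z_bins]

# Projected Fisher matrix for a diagonal original matrix.
f_proj = [[sum(w * j[k] * j[m] for w, j in zip(weights, jac))
           for m in range(2)] for k in range(2)]

# Covariance of (w_0, w_a) is the inverse of the projected Fisher matrix.
det = f_proj[0][0] * f_proj[1][1] - f_proj[0][1] ** 2
c11, c12, c22 = f_proj[1][1] / det, -f_proj[0][1] / det, f_proj[0][0] / det

z_pivot = -c12 / (c12 + c22)
sigma_wp = math.sqrt(c11 - c12 ** 2 / c22)
sigma_wa = math.sqrt(c22)
fom = 1.0 / (sigma_wp * sigma_wa)
```

Because $\sigma_{w_p}\sigma_{w_a}$ equals the square root of the determinant of the $w_0$-$w_a$ covariance matrix, the FoM is also the square root of the determinant of the projected Fisher matrix, which provides a quick consistency check.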

One drawback to the $w_0$-$w_a$ parameterization is that constraints on $w(z)$ at high redshift are
coupled to those at low redshift by the form of the model; for example, if observations determine
the value of the equation of state perfectly at $z = 0$ and at $z = 0.1$, then it is completely determined
at high redshift even in the absence of high redshift data. To specifically address questions related
to the ability of dark energy probes to constrain dark energy at low redshift vs. high redshift, we
define an alternative but equally simple parameterization in which $w(z)$ takes constant, independent
values in each of two bins at $z < 1$ and $z > 1$. The projection onto this parameterization using
equation (169) requires the derivatives $\partial w_i/\partial w(z < 1) = \Theta(1 - z_i)$ and $\partial w_i/\partial w(z > 1) = 1 - \Theta(1 - z_i)$,
where $\Theta(x)$ is the Heaviside step function, equal to 0 for $x < 0$ and 1 for $x > 0$.
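The Jacobian for the two-bin parameterization is simple enough to write out explicitly. The sketch below assumes the 36 bins of width $\Delta a = 0.025$ described earlier and evaluates the step function at bin centres, which is an illustrative choice; the bin edge at $a = 0.5$ coincides with $z = 1$, so no bin straddles the split.

```python
DELTA_A = 0.025
a_bins = [1.0 - DELTA_A * (i - 0.5) for i in range(1, 37)]   # bin centres
z_bins = [1.0 / a - 1.0 for a in a_bins]


def heaviside(x):
    """Step function: 0 for x < 0 and 1 for x > 0."""
    return 1.0 if x > 0.0 else 0.0


# Rows are (dw_i/dw(z<1), dw_i/dw(z>1)) for each original bin.
jac = [(heaviside(1.0 - z), 1.0 - heaviside(1.0 - z)) for z in z_bins]

n_low = int(sum(row[0] for row in jac))    # bins assigned to z < 1
n_high = int(sum(row[1] for row in jac))   # bins assigned to z > 1
```

With this binning, the first 20 bins map onto $w(z < 1)$ and the remaining 16 onto $w(z > 1)$.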

Principal components (PCs) of the dark energy equation of state provide another way to determine
which features of the equation of state evolution are best constrained by a given combination of
experiments (Huterer and Starkman 2003; Hu 2002; Huterer and Cooray 2005; Wang and Tegmark
2005; Dick et al. 2006; Simpson and Bridle 2006; de Putter and Linder 2008; Tang et al. 2011;
Crittenden et al. 2003; Mortonson et al. 2009b; Kitching and Amara 2009; Maturi and Mignone
2009). We compute the PCs for each forecast case by taking the total Fisher matrix for the original
parameter set (eq. 161) and marginalizing over all parameters other than the 36 binned values of
$w_i$. If we call the Fisher matrix for the $w_i$ parameters $F^w$, then the PCs are found by diagonalizing
$F^w$:

$$F^w = Q \Lambda Q^T , \tag{171}$$

where $Q$ is an orthogonal matrix whose columns are eigenvectors of $F^w$ and $\Lambda$ is a diagonal
matrix containing the corresponding eigenvalues of $F^w$. Up to an arbitrary normalization factor,
the eigenvectors are equal to the PC functions $\mathbf{e}_i = (e_i(z_1), e_i(z_2), \ldots)$, which describe how the binned
values of $w(z)$ are weighted with redshift. Here we adopt the normalization of Albrecht et al. (2009),



$$\sum_{k=1}^{36} e_i(z_k) e_j(z_k) = \sum_{k=1}^{36} e_k(z_i) e_k(z_j) = (\Delta a)^{-1} \delta_{ij} , \tag{172}$$

where $\Delta a = 0.025$ is the bin width; for $i = j$ this condition approximately corresponds to
$\int_0^1 da\,[e_i(a)]^2 = 1$. With this convention, the columns of $Q$ are $(\Delta a)^{1/2} \mathbf{e}_i$. The PCs rotate
the original set of parameters to a set of PC amplitudes $Q^T(\mathbf{1} + \mathbf{w})$ with elements

$$\beta_i = (\Delta a)^{1/2} \sum_{j=1}^{36} e_i(z_j)(1 + w_j) . \tag{173}$$

Combining equations (172) and (173), we can construct $w(z)$ in each redshift bin from a given set
of PC amplitudes as

$$w_i = -1 + \sum_{j=1}^{36} \alpha_j e_j(z_i) , \tag{174}$$

where $\alpha_j = (\Delta a)^{1/2} \beta_j$. The accuracy with which the $\alpha_i$ can be determined from the data is given
by the eigenvalues of $F^w$, $\sigma_i \equiv \sigma_{\alpha_i} = (\Delta a/\lambda_i)^{1/2}$, and the PCs are numbered in order of increasing
variance (i.e. $\sigma_{i+1} > \sigma_i$).
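The PC construction amounts to an eigendecomposition followed by a rescaling. The sketch below uses a small hand-written Jacobi eigenvalue routine so that it needs only the standard library, and a made-up $4 \times 4$ matrix standing in for $F^w$ (the real matrix has 36 bins). It assumes the normalization $\sum_k e_i(z_k) e_j(z_k) = (\Delta a)^{-1}\delta_{ij}$ and the error $\sigma_i = (\Delta a/\lambda_i)^{1/2}$.

```python
import math


def jacobi_eigh(mat, max_sweeps=50):
    """Eigenvalues and eigenvectors (columns of q) of a symmetric matrix."""
    n = len(mat)
    a = [row[:] for row in mat]
    q = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for r in range(p + 1, n):
                if a[p][r] == 0.0:
                    continue
                diff = a[r][r] - a[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][r] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, q):        # rotate columns p, r of a and q
                    for k in range(n):
                        mp, mr = m[k][p], m[k][r]
                        m[k][p], m[k][r] = c * mp - s * mr, s * mp + c * mr
                for k in range(n):      # rotate rows p, r of a
                    ap, ar = a[p][k], a[r][k]
                    a[p][k], a[r][k] = c * ap - s * ar, s * ap + c * ar
    return [a[i][i] for i in range(n)], q


DELTA_A = 0.025
# Made-up, positive-definite stand-in for the Fisher matrix of the w_i.
F_w = [[400.0, 120.0, 30.0, 5.0],
       [120.0, 200.0, 60.0, 10.0],
       [30.0, 60.0, 120.0, 20.0],
       [5.0, 10.0, 20.0, 40.0]]

eigvals, Q = jacobi_eigh(F_w)
order = sorted(range(len(eigvals)), key=lambda i: -eigvals[i])

# pcs[i][k] = e_i(z_k): eigenvector columns rescaled by (Delta a)^(-1/2).
pcs = [[Q[k][i] / math.sqrt(DELTA_A) for k in range(len(F_w))] for i in order]
sigmas = [math.sqrt(DELTA_A / eigvals[i]) for i in order]
```

Sorting by decreasing eigenvalue puts the best-measured component first, matching the convention that the PCs are numbered in order of increasing variance.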

For constraints that are marginalized over the $w_i$ parameters, we impose a weak prior on $w_i$
as suggested by Albrecht et al. (2009) to reduce the dependence of forecasts for $\Delta\gamma$ on the poorly
constrained high redshift $w_i$ values, since arbitrarily large fluctuations in $w(z)$ can alter the high
Table 7. Key to forecast variations

Label      Description
Any x4     Quadruple fiducial errors (divide Fisher matrix by 16).
Any x2     Double fiducial errors (divide Fisher matrix by 4).
Any /2     Halve fiducial errors (multiply Fisher matrix by 4).
SN-III     Stage III-like SN: total magnitude error of 0.02 per Δz = 0.2 bin
           over 0.2 < z < 0.8 and in local sample at z = 0.05.
SNzmax     Increase max. redshift to z_max = 1.6 (7 bins with Δz = 0.2 and 0.01 mag. error).
SN-local   Omit local sample at z = 0.05.
SNcx       Correlated errors: σ_m,u = σ_m,c = 0.007, Δz_c = 0.2, with x bins over 0.2 < z < 0.8.
BAO-III    Stage III-like BAO, approximating forecasts for BOSS LRGs+HETDEX:
           (D/r_s, H r_s) errors of (1.0%, 1.8%) at z = 0.35, (1.0%, 1.7%) at z = 0.6,
           and (0.8%, 0.8%) at z = 2.4. These are "BAO only" forecasts for BOSS
           and "full power spectrum" forecasts for HETDEX.
BAOzmax    Reduce maximum redshift to z_max = 2 (20 bins), retaining f_sky = 0.25.
WL-opt     "Optimistic" Stage IV case (total error = 2 x statistical).
WL-III     Stage III-like WL, approximating forecasts for DES:
CMB-W9     Fisher matrix forecast for 9-year WMAP data.


redshift growth rate. We include a weak Gaussian prior with width $\sigma_{w_i} = \Delta w/\sqrt{\Delta a}$ by adding to
the total Fisher matrix

$$F^{\rm prior}_{ij} = \begin{cases} \sigma_{w_i}^{-2}\,\delta_{ij} , & i \le 36 , \\ 0 , & i > 36 , \end{cases}$$

assuming that the parameters are ordered as in equation (161) with $p_1 = w_1$, $p_2 = w_2$, etc. For most
forecasts, we use a default prior width of $\Delta w = 10$ ($\sigma_{w_i} \approx 63$), which approximately corresponds
to requiring that the average value of $|1 + w|$ in all bins does not exceed 10. In the next section
we also consider how constraints on certain parameters change with a narrower prior of $\Delta w = 1$.
For priors wider than the default choice, the Fisher matrix computations are subject to numerical
effects arising from the use of a finite number of $w_i$ bins to approximate continuous variations in
$w(z)$, so we do not present results with weaker priors than $\Delta w = 10$. Note that the construction
of PCs of $w(z)$ as described above does not include such a prior on $w_i$.
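The prior described above can be written as a sparse diagonal Fisher matrix. The sketch below assumes a 45-element parameter vector (36 binned $w_i$ values followed by nine other parameters) and a prior of width $\Delta w/\sqrt{\Delta a}$ on each $w_i$; the function name `prior_fisher` is hypothetical.

```python
import math

N_W_BINS = 36    # number of binned w_i parameters
N_PARAMS = 45    # 36 w_i values plus nine other parameters
DELTA_A = 0.025  # bin width in scale factor


def prior_fisher(delta_w, n_w=N_W_BINS, n_params=N_PARAMS, delta_a=DELTA_A):
    """Return the prior width and the Fisher matrix of the prior on w_i."""
    sigma = delta_w / math.sqrt(delta_a)
    fisher = [[0.0] * n_params for _ in range(n_params)]
    for i in range(n_w):
        fisher[i][i] = 1.0 / sigma ** 2   # only the w_i block is non-zero
    return sigma, fisher


sigma_default, F_prior = prior_fisher(10.0)   # default prior, width ~63
sigma_narrow, F_narrow = prior_fisher(1.0)    # narrower alternative
```

The resulting matrix is simply added to the total Fisher matrix before inversion, so it leaves the parameters beyond the first 36 unconstrained by the prior.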

8.3. Results: Forecasts for the Fiducial Program and Variations 
8.3.1. Constraints in simple w(z) models

We begin with forecasts for which the 36 $w(z)$ bins are projected onto the simpler $w_0$-$w_a$
parameter space. Tables 8-10 give the forecast $1\sigma$ uncertainties for the fiducial program and
numerous variations. Each forecast case is labeled by a list of the Fisher matrices that are added 
together, and the basic variations we consider are simple rescalings of the total errors for each probe; 
for example, [SN/2,BAOx4,WL-opt,CMB] includes the fiducial SN data with the total error halved 
(i.e. the Fisher matrix multiplied by 4), 4 times the fiducial BAO errors, the optimistic version of the 
WL forecast, and the fiducial Planck CMB Fisher matrix. Note that /2 denotes a more powerful 
program and x2 denotes a less powerful program. The key in Table 7 describes other types of
variations of the fiducial probes. In some cases we omit a probe entirely, e.g. [SN,BAO,WL] sums 
Table 8. Forecast Uncertainties for Variations of the Fiducial Program

 #  Forecast case             z_p    σ_wp    FoM    σ_w(z>1)  10^3 σ_Ωk  10^2 σ_h  σ_Δγ    σ_lnG9
 1  [SN,BAO,WL,CMB]           0.46   0.014   664    0.051     0.55       0.51      0.034   0.015
 2  [SN,BAO,WL-opt,CMB]       0.39   0.013   789    0.049     0.64       0.42      0.026   0.016
 3  [BAO,WL,CMB]              0.63   0.017   321    0.054     0.56       0.99      0.034   0.015
 4  [SN-III,BAO,WL,CMB]       0.57   0.016   433    0.053     0.56       0.75      0.034   0.015
 5  [SNx4,BAO,WL,CMB]         0.61   0.017   353    0.054     0.56       0.91      0.034   0.015
 6  [SNx2,BAO,WL,CMB]         0.57   0.016   433    0.053     0.56       0.75      0.034   0.015
 7  [SN/2,BAO,WL,CMB]         0.32   0.010   1197   0.049     0.55       0.32      0.034   0.015
 8  [SNzmax,BAO,WL,CMB]       0.42   0.011   841    0.050     0.55       0.40      0.034   0.015
 9  [SN-local,BAO,WL,CMB]     0.59   0.016   376    0.053     0.56       0.85      0.034   0.015
10  [SNc3,BAO,WL,CMB]         0.46   0.014   652    0.051     0.55       0.51      0.034   0.015
11  [SNc6,BAO,WL,CMB]         0.46   0.014   663    0.051     0.55       0.51      0.034   0.015
12  [SNc12,BAO,WL,CMB]        0.46   0.014   667    0.051     0.55       0.50      0.034   0.015
13  [SN,WL,CMB]               0.26   0.022   152    0.321     2.13       0.72      0.038   0.022
14  [SN,BAO-III,WL,CMB]       0.32   0.019   299    0.120     1.19       0.57      0.035   0.017
15  [SN,BAOx4,WL,CMB]         0.30   0.020   245    0.145     1.16       0.65      0.036   0.018
16  [SN,BAOx2,WL,CMB]         0.36   0.018   380    0.087     0.76       0.58      0.035   0.016
17  [SN,BAO/2,WL,CMB]         0.50   0.010   1222   0.033     0.47       0.39      0.034   0.014
18  [SN,BAOzmax,WL,CMB]       0.42   0.014   547    0.071     0.66       0.52      0.034   0.015
19  [SN,BAO,CMB]              0.41   0.016   539    0.059     0.78       0.53      ...     ...
20  [SN,BAO,WL-III,CMB]       0.41   0.016   543    0.058     0.77       0.52      0.145   0.048
21  [SN,BAO,WLx4,CMB]         0.42   0.016   553    0.057     0.75       0.53      0.126   0.031
22  [SN,BAO,WLx2,CMB]         0.43   0.015   587    0.055     0.68       0.52      0.065   0.020
23  [SN,BAO,WL/2,CMB]         0.48   0.012   815    0.047     0.45       0.47      0.018   0.012
24  [SN,BAO,WL-optx4,CMB]     0.41   0.016   556    0.058     0.76       0.52      0.085   0.022
25  [SN,BAO,WL-optx2,CMB]     0.41   0.015   606    0.055     0.73       0.49      0.045   0.018
26  [SN,BAO,WL-opt/2,CMB]     0.37   0.009   1397   0.040     0.52       0.30      0.017   0.013
27  [SN,BAO,WL]               0.31   0.020   368    0.075     7.82       1.48      0.037   6.697
28  [SN,BAO,WL,CMB-W9]        0.43   0.015   592    0.055     1.07       0.53      0.036   0.019

Note. Forecasts in this table vary the assumptions about a single probe at a time from the
fiducial program. With the exception of w(z > 1), a w_0-w_a model for the dark energy equation of
state is assumed for all parameter uncertainties here and in Tables 9 and 10. All forecasts allow for
deviations from GR parameterized by Δγ and G_9.
Table 9. Forecast Uncertainties for Variations of the Fiducial Program (Continued)

 #  Forecast case                     z_p    σ_wp    FoM    σ_w(z>1)  10^3 σ_Ωk  10^2 σ_h  σ_Δγ    σ_lnG9
 1  [SN,BAO,WL,CMB]                   0.46   0.014   664    0.051     0.55       0.51      0.034   0.015
 2  [SN-III,BAO-III,WL-III,CMB]       0.42   0.032   131    0.137     1.36       0.96      0.147   0.051
 3  [SN-III,BAO-III,WL-III,CMB-W9]    0.33   0.039   92     0.174     2.41       1.01      0.148   0.064
 4  [SNx4,BAOx4,WLx4,CMB]             0.51   0.048   52     0.179     1.32       1.98      0.128   0.033
 5  [SNx2,BAOx2,WLx2,CMB]             0.49   0.026   188    0.095     0.85       1.00      0.065   0.021
 6  [SN/2,BAO/2,WL/2,CMB]             0.43   0.007   2439   0.027     0.34       0.26      0.018   0.011
 7  [SN/2,BAO/2,WL-opt,CMB]           0.34   0.008   1832   0.035     0.55       0.26      0.023   0.014
 8  [SN-III,BAO-III,WL,CMB]           0.44   0.026   169    0.126     1.20       0.89      0.035   0.017
 9  [SNx4,BAOx4,WL,CMB]               0.50   0.034   85     0.157     1.18       1.49      0.037   0.019
10  [SNx4,BAOx2,WL,CMB]               0.57   0.026   153    0.093     0.77       1.28      0.035   0.016
11  [SNx4,BAO/2,WL,CMB]               0.57   0.011   891    0.033     0.47       0.53      0.034   0.014
12  [SNx2,BAOx4,WL,CMB]               0.41   0.029   132    0.151     1.17       1.01      0.037   0.018
13  [SNx2,BAOx2,WL,CMB]               0.49   0.023   218    0.090     0.76       0.92      0.035   0.016
14  [SNx2,BAO/2,WL,CMB]               0.55   0.011   966    0.033     0.47       0.49      0.034   0.014
15  [SN/2,BAOx4,WL,CMB]               0.25   0.012   499    0.142     1.15       0.47      0.036   0.017
16  [SN/2,BAOx2,WL,CMB]               0.27   0.011   735    0.084     0.76       0.39      0.035   0.016
17  [SN/2,BAO/2,WL,CMB]               0.38   0.008   1921   0.032     0.47       0.27      0.034   0.014
18  [SNzmax,BAOzmax,WL,CMB]           0.40   0.012   694    0.069     0.66       0.42      0.034   0.015

Note. Same as Table 8 but varying two or three probes at a time from the fiducial specifications.
Table 10. Forecast Uncertainties for Variations of the Fiducial Program (Continued)

 #  Forecast case             z_p    σ_wp    FoM    σ_w(z>1)  10^3 σ_Ωk  10^2 σ_h  σ_Δγ    σ_lnG9
 1  [SN,BAO,WL,CMB]           0.46   0.014   664    0.051     0.55       0.51      0.034   0.015
 2  [SN,BAO-III,WL-III,CMB]   0.29   0.022   239    0.129     1.35       0.59      0.147   0.051
 3  [SN,BAOx4,WLx4,CMB]       0.28   0.022   185    0.165     1.30       0.77      0.128   0.033
 4  [SN,BAOx4,WLx2,CMB]       0.28   0.021   200    0.159     1.26       0.73      0.067   0.023
 5  [SN,BAOx4,WL/2,CMB]       0.35   0.016   373    0.115     0.98       0.54      0.020   0.014
 6  [SN,BAOx4,WL-opt,CMB]     0.29   0.015   361    0.102     1.21       0.57      0.042   0.020
 7  [SN,BAOx2,WLx4,CMB]       0.34   0.019   328    0.092     0.90       0.62      0.127   0.031
 8  [SN,BAOx2,WLx2,CMB]       0.35   0.019   340    0.090     0.85       0.61      0.065   0.021
 9  [SN,BAOx2,WL/2,CMB]       0.40   0.015   502    0.078     0.67       0.51      0.019   0.013
10  [SN,BAOx2,WL-opt,CMB]     0.33   0.014   506    0.072     0.83       0.49      0.033   0.017
11  [SN,BAO/2,WLx4,CMB]       0.43   0.012   926    0.041     0.65       0.40      0.126   0.031
12  [SN,BAO/2,WLx2,CMB]       0.45   0.011   1010   0.038     0.59       0.40      0.064   0.020
13  [SN,BAO/2,WL/2,CMB]       0.54   0.008   1585   0.028     0.34       0.38      0.018   0.012
14  [SN,BAO/2,WL-opt,CMB]     0.43   0.010   1251   0.035     0.55       0.35      0.023   0.015
15  [SN-III,BAO,WL-III,CMB]   0.54   0.019   346    0.060     0.77       0.79      0.146   0.048
16  [SNx4,BAO,WLx4,CMB]       0.60   0.020   277    0.060     0.75       0.99      0.126   0.031
17  [SNx4,BAO,WLx2,CMB]       0.60   0.019   298    0.058     0.68       0.97      0.065   0.020
18  [SNx4,BAO,WL/2,CMB]       0.59   0.014   486    0.049     0.45       0.75      0.018   0.012
19  [SNx4,BAO,WL-opt,CMB]     0.47   0.014   568    0.049     0.64       0.56      0.026   0.016
20  [SNx2,BAO,WLx4,CMB]       0.54   0.019   351    0.059     0.75       0.79      0.126   0.031
21  [SNx2,BAO,WLx2,CMB]       0.55   0.018   375    0.057     0.68       0.78      0.065   0.020
22  [SNx2,BAO,WL/2,CMB]       0.56   0.013   567    0.048     0.45       0.65      0.018   0.012
23  [SNx2,BAO,WL-opt,CMB]     0.45   0.014   619    0.049     0.64       0.52      0.026   0.016
24  [SN/2,BAO,WLx4,CMB]       0.28   0.011   998    0.056     0.74       0.33      0.126   0.031
25  [SN/2,BAO,WLx2,CMB]       0.30   0.011   1061   0.053     0.67       0.33      0.065   0.020
26  [SN/2,BAO,WL/2,CMB]       0.35   0.009   1430   0.045     0.44       0.30      0.018   0.012
27  [SN/2,BAO,WL-opt,CMB]     0.30   0.010   1242   0.049     0.64       0.30      0.026   0.015

Note. Continuation of Table 9.
the fiducial Fisher matrices of the three main probes but does not include the Planck CMB priors. 
Note that even though we assume a specific systematic error component in computing certain Fisher 
matrices (in particular, F^WL), the cases with rescaled errors simply multiply each Fisher matrix 
by a constant factor and thus do not distinguish between statistical and systematic contributions 
to the total error. 
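
The rescaling described above can be sketched in a few lines. Multiplying every error of a probe by a factor k divides that probe's Fisher matrix by k^2, and independent probes combine by adding their Fisher matrices. The function name and the 2x2 matrices below are hypothetical illustrations, not values from the forecasts.

```python
import numpy as np

def rescale_fisher(fisher, error_factor):
    """Fisher matrix of a probe whose total errors are all multiplied by error_factor."""
    return fisher / error_factor**2

# Hypothetical 2-parameter Fisher matrices (illustrative numbers only).
F_sn = np.array([[40.0, 10.0], [10.0, 5.0]])
F_bao = np.array([[30.0, -8.0], [-8.0, 6.0]])

# Independent probes combine by adding Fisher matrices;
# this is the analogue of a "SNx2, BAO" case in the tables' notation.
F_total = rescale_fisher(F_sn, 2.0) + F_bao

# Marginalized 1-sigma errors are the square roots of the diagonal of F^-1.
sigma_total = np.sqrt(np.diag(np.linalg.inv(F_total)))
```

Because the whole matrix is scaled by one constant, the operation cannot tell a statistical error from a systematic one, which is the caveat made in the text.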

Constraints on the equation of state are given in Tables 8-10 by the DETF FoM and the error on 
w_p. The rule of thumb that σ(w_a) = [FoM × σ(w_p)]^-1 ≈ 10 σ(w_p) holds at the ~30% level for most of the 
forecast variations we consider; i.e., at the best-constrained redshift, the value of w is typically 
determined a factor of ten better than the value of its derivative. The forecast tables also list the 
uncertainty in the high redshift equation of state w(z > 1) for the alternative parameterization 
where w(z) takes independent, constant values at z < 1 and z > 1. Note that all of these w(z) 
constraints are marginalized over uncertainties in G_9 and Δγ, so they do not assume that structure 
growth follows the GR prediction. 
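
As a quick check of this rule of thumb, the short sketch below inverts the DETF FoM definition, FoM = [σ(w_p) σ(w_a)]^-1, to recover σ(w_a). It uses the fiducial values quoted in this section (FoM = 664, σ(w_p) = 0.014); the function name is an illustrative assumption.

```python
def sigma_wa_from_fom(fom, sigma_wp):
    """Invert FoM = 1 / (sigma_wp * sigma_wa) for sigma_wa."""
    return 1.0 / (fom * sigma_wp)

# Fiducial Stage IV values quoted in the text.
fom_fid = 664.0
sigma_wp_fid = 0.014

sigma_wa_fid = sigma_wa_from_fom(fom_fid, sigma_wp_fid)
ratio = sigma_wa_fid / sigma_wp_fid  # the rule of thumb says this is roughly 10

print(f"sigma(w_a) = {sigma_wa_fid:.3f}, sigma(w_a)/sigma(w_p) = {ratio:.1f}")
```

The ratio comes out a little below 8, inside the ~30% tolerance stated for the rule.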

For the fiducial program outlined in §8.1, the DETF FoM is projected to be around 600-800, 
depending on whether the WL forecast uses the default systematic error model or the optimistic 
model. This is roughly an order of magnitude larger than the FoM forecast for a combination 
of Stage III experiments (e.g. see Table [U rows 2-3) and nearly two orders of magnitude larger 
than current, "Stage II" FoM values (~10). The equation of state in the w_0-w_a parameterization 
is best measured by the fiducial set of Stage IV experiments at a redshift z_p ≈ 0.5 with a 1σ 
precision of σ(w_p) ≈ 0.014, and the time variation of w(z) is determined to within σ(w_a) ≈ 0.11. The 
fiducial program also yields impressive constraints of 5.5 × 10^-4 on Ω_k and 0.51 km s^-1 Mpc^-1 on 
H_0. Forecast 1σ errors for the modified gravity parameters are 0.034 on Δγ and 0.015 on ln G_9. 
We caution, however, that the Ω_k, H_0, and G_9 errors (but not the Δγ error) are sensitive to our 
assumption of the w_0-w_a parameterization (see Figures 33-37 below). CMB constraints make a 
critical contribution: the FoM drops from 664 to 368 if they are omitted entirely (Table 8, line 
27). The difference between Planck precision and anticipated WMAP9 precision is modest 
(line 28), except for Ω_k, where it is a factor of two. 

Figure 30 illustrates the key results of our forecasting investigation, highlighting many aspects 
of the interplay among the three observational probes. In the upper left panel, the solid curve 
shows how the FoM changes as the total SN errors vary from four times fiducial to half fiducial, 
keeping the other probes (BAO, WL, and CMB) fixed at their fiducial levels. Other curves show 
the effect of doubling WL or BAO errors or switching to the optimistic WL forecast. The lower 
panels show analogous results from varying the BAO or WL errors, while the upper right panel 
shows the effect of changing the maximum redshift of the SN program. Over the range of variations 
plotted in Figure 30, the FoM varies from just over 100 to almost 1400. 

The scaling of the FoM with the forecast errors is not uniform among the three main probes. 
Starting from the fiducial program, the effect of doubling or halving errors is greater for BAO 
than for SN, and greater for SN than for WL. This scaling implies that BAO data provide the 
greatest leverage in these forecasts. However, the hierarchy of the three probes is sensitive to the 
assumptions about each experiment; in particular, assuming the optimistic version of WL errors 
promotes WL from having the least leverage on the FoM to having the most leverage. More 
generally, the fact that varying the errors of any individual probe changes the FoM noticeably 
demonstrates the complementarity of the methods. 

Unlike many previous FoM forecasts, we marginalize over the structure growth parameters 
Δγ and ln G_9, which tends to increase the uncertainties on w_0 and w_a. In most cases, the 
difference between the marginalized constraints and ones obtained under the assumption of GR 
(Δγ = ln G_9 = 0) is small, but the difference is greater if WL contributes significantly to 
expansion history constraints; for example, for the fiducial program, the change in the FoM due to 
Figure 30 The DETF FoM, [σ(w_p) σ(w_a)]^-1, for the fiducial program and simple variants. In each 
panel, the open circle marks the FoM of the fiducial program. In the upper left panel, the other 
points along the solid curve show the effect of scaling the error on the SN measurements by factors 
of 2 or 4 while keeping errors for other probes fixed at their fiducial values. Dotted, short-dashed, 
and long-dashed curves show the effect of, respectively, doubling the BAO errors, doubling the WL 
errors, or adopting the optimistic WL forecasts in which systematic errors are simply twice the 
statistical errors. Other panels show analogous results, but instead of scaling the total SN error 
they scale the total BAO error (lower left), the total WL error (lower right), or the maximum 
redshift of the SN constraints (upper right). In each panel, the dashed gray line marks the forecast 
performance of Stage III probes (including Planck) with FoM = 131. 
assuming GR is only 664 → 771, whereas with the WL-opt forecast the change is 789 → 1119. 
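
The difference between marginalizing over the growth parameters and fixing them to their GR values can be illustrated with a toy Fisher matrix. Fixing a parameter deletes its row and column from F; marginalizing inverts F, keeps the sub-block of the covariance, and inverts back. The 3x3 matrix and helper names below are hypothetical and are not the forecast matrices.

```python
import numpy as np

def marginalized_block(fisher, keep):
    """Fisher matrix for the parameters in `keep`, marginalized over all others."""
    cov = np.linalg.inv(fisher)
    return np.linalg.inv(cov[np.ix_(keep, keep)])

def fixed_block(fisher, keep):
    """Fisher matrix for the parameters in `keep`, with all others held fixed."""
    return fisher[np.ix_(keep, keep)]

def detf_fom(fisher_w0wa):
    # For the 2x2 (w0, wa) block, 1/(sigma_wp * sigma_wa) equals sqrt(det F).
    return np.sqrt(np.linalg.det(fisher_w0wa))

# Hypothetical Fisher matrix; parameter order is (w0, wa, one growth parameter).
F = np.array([[900.0, 250.0, 120.0],
              [250.0, 100.0, 20.0],
              [120.0, 20.0, 400.0]])
keep = [0, 1]

fom_marginalized = detf_fom(marginalized_block(F, keep))
fom_gr_fixed = detf_fom(fixed_block(F, keep))
```

Fixing the growth parameter can only raise the FoM, and the gain grows with the off-diagonal coupling between growth and (w_0, w_a), which mirrors the larger GR-versus-marginalized gap in the WL-opt case.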

The local calibrator sample plays an important role in the SN constraints. Omitting the 
measurement at z = 0.05 reduces the FoM from 664 to 376 (Table 8, line 9). Even replacing it with a 
measurement over a broad low-redshift bin 0 < z < 0.2, still with an error of 0.01 mag, reduces the 
FoM from 664 to 533 because it increases degeneracy between the supernova absolute magnitude 
scale and dark energy parameters. Reducing the redshift of the calibrator sample below 0.05 makes 
little further difference, and at lower redshifts peculiar velocity uncertainties may become too large 
to remove with high precision. It is also interesting to ask whether it is better to go after SNe 
at high redshifts or to focus on reducing the errors on SN data at low redshifts. Comparing the 
upper panels of Figure 30, we find that the benefit from reducing errors is typically greater than 
that from obtaining SNe beyond z ~ 1, at least for the FoM. For example, reducing the error per 
redshift bin from 0.01 mag (the fiducial value) to 0.005 mag raises the FoM by a factor of 1.80, but 
increasing the maximum redshift from 0.8 to 1.6 raises the FoM by only 1.27 (see Table 8). If BAO 
errors are doubled, the FoM drops substantially, but SN errors still have much greater leverage 
than SN maximum redshift. We have assumed in these forecasts that the error per redshift bin 
stays constant as the maximum SN redshift increases, but in reality higher redshift SNe are likely 
to have larger systematic errors associated with them, which would diminish the gains from high 
redshift SNe even more than indicated by the flattening of curves in Figure 30. Of course, once 
the systematic errors at z < 0.8 are saturated, pushing to higher redshift may be the only way to 
continue improving SN constraints. 

The weak dependence of w(z) constraints on the maximum SN redshift extends to other 
parameters as well. Figure 31 compares the effect on 1σ errors of varying the maximum SN redshift 
to that of varying the maximum BAO redshift. For the w_0-w_a model, the errors on all parameters 
are relatively insensitive to changes in the maximum SN redshift at z > 1, but the errors on w_a 
and Ω_k decrease by a factor of a few as the maximum BAO redshift increases from z = 1 to z = 3. 
Likewise, the high redshift equation of state w(z > 1) can be determined much more precisely as 
BAO data extend to higher redshifts, but it depends little on the maximum SN redshift. For the 
fiducial Stage IV forecasts, only the Hubble constant error depends significantly on the depth of SN 
observations (assuming a w_0-w_a model). More pessimistic assumptions about the achievable BAO 
errors enhance the importance of high redshift SNe for determining w_p (dotted line in Figure 31), 
but the dependence of other parameters on z_max for the SN data remains weak. 

8.3.2. Constraints on structure growth parameters 

While the DETF FoM is a useful metric for studying the impact of variations in each of the dark 
energy probes, it does not tell the whole story. Deviations from the standard model might show up 
in other sectors of the parameter space; for example, a detection of non-GR values for the growth 
parameters Δγ and G_9 could point to a modified gravity explanation for cosmic acceleration that 
would not be evident from measurements of w(z) alone. Thus, even the less optimistic version of the 
WL experiment, which adds relatively little to the w(z) constraints obtained by the combination 
of fiducial SN, BAO, and CMB forecasts, is a critical component of a program to study cosmic 
acceleration because of its unique role in determining the growth parameters Δγ and G_9. 

The impact of various experiments on the structure growth parameters is more evident if we 
extend the DETF FoM to include Δγ in addition to w_0 and w_a. As shown in Figure 32, the scaling 
of this new FoM with respect to WL errors (and, to a lesser extent, BAO errors) is much steeper 
than it is for the usual FoM (Figure 30). We do not show the scaling with SN errors or z_max since 
those assumptions do not affect the expected uncertainties for Δγ and G_9 (see Table 8, lines 3-12). 
One could also consider versions of the FoM that include uncertainties in G_9 and that account for 
the correlations between the structure growth parameters and the dark energy equation of state. 
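
A minimal sketch of such an extended figure of merit follows, assuming the normalization used for Figure 32: the DETF FoM multiplied by the ratio of the fiducial Δγ error (0.034) to the Δγ error of the case in question. The two example cases take their FoM and Δγ errors from rows 24 and 26 of the forecast table; reading the seventh numerical column of that table as σ(Δγ) is an assumption based on the fiducial value quoted in the text.

```python
SIGMA_DGAMMA_FID = 0.034  # fiducial 1-sigma error on Delta-gamma quoted in the text

def fom_with_growth(detf_fom, sigma_dgamma):
    """DETF FoM scaled by the improvement in the Delta-gamma error relative to fiducial."""
    return detf_fom * SIGMA_DGAMMA_FID / sigma_dgamma

# (DETF FoM, sigma(Delta-gamma)) for two WL variants at fixed SN and BAO assumptions.
wl_x4 = (998.0, 0.126)     # [SN/2,BAO,WLx4,CMB]
wl_half = (1430.0, 0.018)  # [SN/2,BAO,WL/2,CMB]

detf_gain = wl_half[0] / wl_x4[0]
extended_gain = fom_with_growth(*wl_half) / fom_with_growth(*wl_x4)

print(f"DETF FoM gain: {detf_gain:.2f}, extended FoM gain: {extended_gain:.1f}")
```

Going from WLx4 to WL/2 improves the ordinary FoM by less than a factor of 1.5 but the extended FoM by about a factor of 10, which is the steeper WL scaling described above.
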
Figure 31 Variation of 1σ parameter errors with the maximum redshift for BAO (left) or SN data 
(right). For the solid curves, fiducial Stage IV forecasts are assumed for all other probes. The 
dotted curve in the right panel shows the scaling of σ(w_p) with SN z_max assuming 4 times larger 
BAO errors (BAO×4). The plotted errors assume a w_0-w_a parameterization (except for w(z > 1)). 

The complementarity between the SN, BAO, and WL techniques is further demonstrated in 
Figures 33-35, which show the forecast 68% confidence level contours in the w_0.5-w_a and Δγ-ln G_9 
planes after marginalizing over other parameters. Instead of w_0 we plot w_0.5, the equation-of-state 
parameter at z = 0.5, because it is much less correlated with w_a for most of the forecast scenarios. 
In every panel, the blue ellipse shows the error contour of the fiducial forecast while other ellipses 
show the effect of varying the errors of the indicated method. The opposite orientation of ellipses 
in Figures 33 and 34 demonstrates the complementary sensitivity of SN and BAO to w(z): the SN 
data are mainly sensitive to the equation of state at low redshift, whereas BAO data measure the 
equation of state at higher redshift. However, the sensitivity to the beyond-GR growth parameters 
comes entirely from WL data, which provide the only direct measurements of growth, and the 
strength of the Δγ and G_9 constraints depends directly on the WL errors, as shown in Figure 35. 
Conversely, these constraints are very weakly sensitive to the SN or BAO errors (Figs. 33 and 34), 
showing that the uncertainties are dominated by the growth measurements themselves rather than 
residual uncertainty in the expansion history. Inspection of Table 8 shows that the Δγ constraints 
are essentially linear in the WL errors, while the ln G_9 constraints scale more slowly. 
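
The reason for plotting w at a redshift near the pivot can be made concrete with a small sketch. For w(a) = w_0 + w_a(1 - a), the variance of w at scale factor a is Var(w_0) + 2(1 - a)Cov(w_0, w_a) + (1 - a)^2 Var(w_a), which is minimized at 1 - a_p = -Cov(w_0, w_a)/Var(w_a); at that pivot, w_p and w_a are uncorrelated. The covariance matrix below is hypothetical, chosen only to give a pivot near z = 0.5.

```python
import numpy as np

def pivot(cov):
    """Pivot redshift and errors from a 2x2 covariance matrix of (w0, wa)."""
    var_w0, var_wa, c = cov[0, 0], cov[1, 1], cov[0, 1]
    x_p = -c / var_wa                    # x = 1 - a at the pivot
    z_p = x_p / (1.0 - x_p)              # since a = 1 / (1 + z)
    sigma_wp = np.sqrt(var_w0 - c**2 / var_wa)
    return z_p, sigma_wp, np.sqrt(var_wa)

def sigma_w_at(cov, z):
    """1-sigma error on w(z) = w0 + wa * z / (1 + z)."""
    x = z / (1.0 + z)
    return np.sqrt(cov[0, 0] + 2.0 * x * cov[0, 1] + x**2 * cov[1, 1])

# Hypothetical errors: sigma(w0) = 0.04, sigma(wa) = 0.11, correlation -0.9.
s0, sa, rho = 0.04, 0.11, -0.9
cov = np.array([[s0**2, rho * s0 * sa], [rho * s0 * sa, sa**2]])

z_p, sigma_wp, sigma_wa = pivot(cov)
fom = 1.0 / (sigma_wp * sigma_wa)        # equals 1 / sqrt(det cov)
```

At z = 0.5 the combination is w_0 + w_a/3, so `sigma_w_at(cov, 0.5)` is close to, but never below, `sigma_wp`.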

Although the w_0-w_a parameterization is flexible enough to describe a wide variety of expansion 
histories, it is too simple to account for all possibilities; in particular, w(z) is restricted to functions 
that are smooth and monotonic over the entire history of the universe. Because many 
cosmological parameters are partially degenerate with the dark energy evolution, assumptions about the 
functional form of w(z) can strongly affect the precision of constraints on other parameters. As 
Figure 32 FoM scaling with BAO errors (left) and WL errors (right) including changes in the error 
on Δγ, normalized to the forecast uncertainty for the fiducial program, σ_Δγ^fid = 0.034. The fiducial 
Stage IV forecast is marked by an open circle. For the Stage III forecast, FoM × (σ_Δγ^fid/σ_Δγ) = 30. 
Figure 33 Forecast constraints (68% confidence levels) for dark energy and growth parameters, 
varying errors on SN data: fiducial ×4 (red), ×2 (green), ×1 (blue), and /2 (black). In all cases, the 
fiducial forecasts are used for the other probes (BAO, WL, CMB). Contours in the left panel use 
the value of the equation of state at z = 0.5 (close to the typical pivot redshift), w_0.5 = w_0 + w_a/3. 
Dashed contours in the right panel show the errors on growth parameters for the binned w(z) 
parameterization, with the default priors corresponding to deviations of < 10 in the average value 
of w. Solid contours assume a w_0-w_a parameterization. In both cases, the G_9 and Δγ constraints 
are essentially independent of the SN errors. 

an example of this model dependence, the right panels of Figures 33-35 show how the constraints 
on the growth parameters weaken (dashed curves) if one allows the 36 binned w_i values to vary 
independently instead of assuming that they conform to the w_0-w_a model. While Δγ forecasts are 
Figure 34 Same as Fig. 33, but varying BAO errors from fiducial ×4 (red) to fiducial/2 (black). 

Figure 35 Same as Fig. 33, but varying WL errors from fiducial ×4 (red) to fiducial/2 (black). Lower 
panels assume the optimistic WL forecasts. 

only mildly affected by the choice of dark energy modeling, constraints on the z = 9 normalization 
parameter G_9 depend strongly on the form of w(z). This dependence follows from the absence of 
data probing redshifts 3 < z < 9 in the fiducial Stage IV program. In the w_0-w_a model, dark 
energy evolution is well determined even at high redshifts, since the two parameters of the model 
can be measured from data at z < 3, and thus the growth function at z = 9 is closely tied to the 
low redshift growth of structure measured by WL. However, allowing w(z) to vary independently at 
high redshift where it is unconstrained by data decouples the low and high redshift growth histories, 
and therefore G_9 can no longer be determined precisely. In fact, the constraints on G_9 in that case 
depend greatly on the chosen prior on w_i (taken to be the default prior of σ_wi = 10/√Δa in 
Figures 33-35). One important consequence of this dependence on the w(z) model is that an apparent 
breakdown of GR via G_9 ≠ 1 might instead be a sign that the chosen dark energy parameterization 
is too restrictive. 

8.3.3. Dependence on w(z) model and binning of data 

Other parameters are also affected to varying degrees by the choice of w(z) model and the 
priors on the model parameters. Figure 36 shows how errors on Ω_k and h are affected by relaxing 
assumptions about dark energy evolution. For the fiducial program and minor variants, Ω_k is very 
weakly correlated with w_0 and w_a, resulting in similar errors on curvature for the w_0-w_a and ΛCDM 
models. However, generalizing the dark energy parameterization to include independent variations 
in 36 redshift bins can degrade the precision of Ω_k measurements by an order of magnitude or 
more. In that case, the error on Ω_k is very sensitive to the chosen prior on the value of w_i in each 
bin, and it improves little as the BAO errors decrease. This dependence on priors reflects the fact 
that curvature is most correlated with the highest redshift w_i values, which are poorly constrained 
by the fiducial combination of data. Relative to curvature, constraints on the Hubble constant are 
affected more by the choice of dark energy parameterization but less by priors on w_i in the binned 
w(z) model. 




Figure 36 Dependence of σ_Ωk (left) and σ_h (right) on BAO errors for various dark energy 
parameterizations and priors. For the w_i curves, the equation of state varies independently in 36 bins 
with Gaussian priors of width σ_wi = Δw/√Δa. The fiducial versions of the Stage IV SN, WL, and 
CMB data are included in all cases. 

Figure 37 shows the dependence of σ_h on the precision of SN data for various dark energy 
parameterizations (σ_Ωk is nearly independent of the SN errors for this range of variations around 
the fiducial forecast; see Table 8). If we assume a w_0-w_a model for dark energy, Hubble constant 
errors strongly depend on the precision of SN data. However, Fig. 37 shows that either decreasing or 
increasing the number of dark energy parameters can almost completely eliminate the dependence 
Figure 37 Dependence of σ_h on SN errors for various dark energy parameterizations and priors, 
including the fiducial BAO, WL, and CMB forecasts. 



of σ_h on the SN data. In the case of the simpler ΛCDM model, the combination of the fiducial 
BAO, WL, and CMB forecasts is sufficient to precisely determine all of the model parameters, and 
adding information from SN data has a negligible effect on the parameter errors. Adding w_0 and w_a 
to the model introduces degeneracies between these dark energy parameters and other parameters, 
including h. Since constraints from SN data help to break these degeneracies, reducing SN errors 
can significantly improve measurement of the Hubble constant in the w_0-w_a model. 

As one continues to add more dark energy parameters to the model, the degeneracies between 
these parameters and h increase, but another effect arises that diminishes the impact of SN data 
on σ_h. Measurement of the Hubble constant requires relating observed quantities at z > 0 (e.g. 
SN distances) to the expansion rate at z = 0. In the case of ΛCDM or the w_0-w_a model, the 
assumed dark energy evolution is simple enough that this relation between z = 0 and low-redshift 
observations is largely set by the model. However, when we specify w(z) by a large number of 
independent bins in redshift, this relation must instead be determined by the data. Since SN data 
are only sensitive to relative changes in distances, the lowest-redshift w_i value (centered at z ≈ 0.01) 
is strongly degenerate with h (Mortonson et al., 2009a). This degeneracy is partially broken by the 
local SN sample at z = 0.05: removing it from the forecasts increases the error on h from 0.44 to 
0.48 in the binned w(z) parameterization, and from 0.0051 to 0.0085 in the w_0-w_a model. SNe at 
even lower redshifts are more sensitive to the Hubble constant, but they also have larger systematic 
uncertainties due to peculiar velocities. 

For BAO data, the choice of redshift bin width affects forecasts for models with general 
equation-of-state variations. Measurements of H(z) and D(z) in narrower bins are better able to constrain 
rapid changes in w(z). They can also reduce uncertainty in the Hubble constant by about a factor 
of two, and in other parameters such as Ω_k, ln G_9, and Δγ by a smaller amount, relative to 
measurements in wide bins. However, in practice one cannot reduce the bin size indefinitely, since 
each bin must contain enough objects to be able to robustly identify and locate the BAO peak; for 
example, requiring that the bin be at least wide enough to contain pairs of objects separated by 
~100 Mpc along the line of sight sets a lower limit of Δz/(1 + z) > 0.03. We do not attempt 
to optimize the choice of bins for the simplified forecasts in this section, but we note that binning 
schemes in analyses of BAO data aimed at constraining general w(z) variations should be chosen 
with care to avoid losing information about dark energy evolution and other parameters. Similar 
concerns are likely to apply for WL data as well. 

8.3.4. Constraints on w(z) in the general model 

So far, in the context of general dark energy evolution we have only considered the forecast errors 
on parameters such as h and Ω_k that are partially degenerate with w(z). But how accurately can 
w(z) itself be measured when we do not restrict it to specific functional forms? Since the errors 
on w_i values in different bins are typically strongly correlated with each other, it is not very useful 
to simply give the expected w_i errors, marginalized over all other parameters. Instead, we can 
consider combinations of the w_i that are independent of one another and ask how well each of these 
combinations can be measured by the fiducial program of observations. 

As mentioned in §2.2, many methods for combining w(z) bins into independent (or nearly 
independent) components have been proposed. Here we adopt the principal component (PC) 
decomposition of the dark energy equation of state. Starting from the Fisher matrix for the combined 
acceleration probes, the PCs are computed by first marginalizing the Fisher matrix over everything 
except for the w_i parameters and then diagonalizing the remaining matrix, as described above in 
§8.2. The shapes of the three best-measured PCs for the fiducial program (with both fiducial and 
optimistic WL assumptions) and some simple variations are plotted in Figure 38. In general, the 
structure of the PCs is similar in all cases; for example, the combination of w_i that is most tightly 
constrained is typically a single, broad peak at z < 1, while the next best-determined combination 
is the difference between w(z ~ 0.1) and w(z ~ 1). However, variations in the forecast assumptions 
slightly alter the shape of each PC and, in particular, shift the redshifts at which features in the 
PC shapes appear. Changes in the location of the peak in the first PC mirror the dependence of 
the pivot redshift z_p for the w_0-w_a model in Tables 8-10, with improved SN data decreasing the 
peak redshift and improved BAO data increasing it. The direction and magnitude of these shifts 
reflect the redshift range that a particular probe is most sensitive to and the degree to which that 
probe contributes to the total constraints on w(z). Note that so far we have only considered the 
impact of forecast assumptions on the functional form of PCs, and not on the precision with which 
each PC can be measured. In general, altering the forecast model changes both the PC shapes and 
PC errors, which complicates the comparison among expected PC constraints from different sets 
of forecasts. 
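
The two-step procedure described above (marginalize, then diagonalize) can be sketched as follows. This is a minimal illustration under stated assumptions: the Fisher matrix is a random positive-definite stand-in with four hypothetical w_i bins and two other parameters, and the PCs are returned as unit-norm eigenvectors, which may differ from the normalization used for the plotted PC shapes.

```python
import numpy as np

def principal_components(fisher, w_idx):
    """Return (PC shapes as rows, 1-sigma errors), best-measured PC first."""
    # Marginalize over everything except the w_i: invert, keep the w block, invert back.
    cov_w = np.linalg.inv(fisher)[np.ix_(w_idx, w_idx)]
    fisher_w = np.linalg.inv(cov_w)
    fisher_w = 0.5 * (fisher_w + fisher_w.T)   # remove round-off asymmetry
    # Diagonalize: eigenvectors are the PCs, eigenvalues are inverse variances.
    eigval, eigvec = np.linalg.eigh(fisher_w)
    order = np.argsort(eigval)[::-1]
    return eigvec[:, order].T, 1.0 / np.sqrt(eigval[order])

# Hypothetical Fisher matrix: 6 parameters, the first 4 playing the role of w_i bins.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 6))
F = A.T @ A

pcs, sigma = principal_components(F, [0, 1, 2, 3])
```

By construction the PC amplitudes have uncorrelated errors, so each `sigma[i]` can be quoted independently, unlike the marginalized errors on individual w_i bins.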

Comparing the top and bottom rows of panels in Figure 38, we see again the contrast between 
the fiducial WL forecast and the "WL-opt" forecast with reduced systematic errors. In the former 
case, decreasing WL errors by a factor of two has a negligible effect on the PC shapes relative to 
similar reductions in SN or BAO errors. However, when we take WL-opt as the baseline forecast 
the PCs depend more on the precision of WL measurements and less on that of the SN or BAO 
data. 

The full set of PCs for the fiducial program is shown in Figure 39, and the forecast errors on the 
PC amplitudes are listed in Table 11. The best-measured, lowest-variance PCs vary smoothly with 
redshift, corresponding to averaging w(z) over fairly broad ranges in z. There is a clear trend of 
increasingly high frequency oscillations for higher PCs. Visual inspection of Figure 39 shows that 
the sum of the number of peaks and the number of troughs in the PC is equal to the index of the 
PC, a pattern that continues at least up to PC 13. Higher PCs often change sign between adjacent z 
bins. High frequency oscillations in w(z) are poorly measured by any combination of cosmological 
data because the evolution of the dark energy density, which determines H(z), depends on an 
integral of w(z) (eq. 22), and D(z) and G(z) depend (approximately) on integrals of H(z). Rapid 
Table 11. Errors on PC Amplitudes for the Fiducial Program 

 i   σ_i    σ_i^opt     i   σ_i    σ_i^opt     i   σ_i    σ_i^opt     i   σ_i    σ_i^opt
 1   0.011  0.009      10   0.135  0.102      19   0.442  0.378      28   1.652  1.810
 2   0.017  0.014      11   0.143  0.116      20   0.779  0.413      29   2.285  2.217
 3   0.026  0.019      12   0.168  0.137      21   0.824  0.436      30   3.243  2.973
 4   0.038  0.026      13   0.180  0.150      22   0.939  0.531      31   6.540  6.785
 5   0.052  0.036      14   0.185  0.160      23   0.978  0.609      32   12.43  19.20
 6   0.067  0.047      15   0.216  0.179      24   1.212  0.725      33   16.59  24.78
 7   0.083  0.062      16   0.252  0.240      25   1.307  0.892      34   25.17  46.41
 8   0.099  0.074      17   0.310  0.244      26   1.457  1.036      35   59.32  94.09
 9   0.115  0.089      18   0.323  0.308      27   1.587  1.561      36   74.12  118.0

Note: σ_i refers to errors for the fiducial Stage IV program 
(CMB+SN+BAO+WL) and σ_i^opt to the optimistic WL case 
(CMB+SN+BAO+WL-opt). 



Table 12. Comparison of Figures of Merit for Selected Forecasts 

Forecast case               log10 Π_i (1+σ_i^-2)^(1/2)   (Σ_i σ_i^-2)^(1/2)   [σ(w_p)σ(w_a)]^-1
[SN,BAO,WL,CMB]                      20.2                       124                   664
[SN/2,BAO,WL,CMB]                    20.8                       176                  1197
[SN,BAO/2,WL,CMB]                    26.0                       186                  1222
[SN,BAO,WL/2,CMB]                    21.6                       140                   816
[SN,BAO,WL-opt,CMB]                  23.0                       157                   789
[SN/2,BAO,WL-opt,CMB]                23.4                       199                  1242
[SN,BAO/2,WL-opt,CMB]                27.9                       205                  1251
[SN,BAO,WL-opt/2,CMB]                20.0                       210                  1397
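
As a consistency check, the two PC-based figures of merit for the fiducial program can be recomputed from the σ_i column of Table 11. The sketch below assumes the conventions that reproduce the first row of Table 12 to within rounding: one half of log10 of the product of (1 + σ_i^-2), and the square root of the sum of σ_i^-2. These conventions are inferred from the tabulated numbers and are not stated explicitly in the text.

```python
import numpy as np

# Fiducial sigma_i values from Table 11 (PCs 1 through 36).
sigma_fid = np.array([
    0.011, 0.017, 0.026, 0.038, 0.052, 0.067, 0.083, 0.099, 0.115,
    0.135, 0.143, 0.168, 0.180, 0.185, 0.216, 0.252, 0.310, 0.323,
    0.442, 0.779, 0.824, 0.939, 0.978, 1.212, 1.307, 1.457, 1.587,
    1.652, 2.285, 3.243, 6.540, 12.43, 16.59, 25.17, 59.32, 74.12,
])

inv_var = sigma_fid**-2.0

# Determinant-style FoM with the unit-variance prior sigma^-2 -> 1 + sigma^-2.
fom_det = 0.5 * np.log10(1.0 + inv_var).sum()

# Sum-style FoM; poorly measured PCs contribute negligibly, so no prior is needed.
fom_sum = np.sqrt(inv_var.sum())

print(f"determinant FoM = {fom_det:.1f}, sum FoM = {fom_sum:.0f}")
```

The determinant FoM rewards gains spread over many PCs, while the sum FoM is dominated by the first few, which is why the two can rank the same pair of experiments differently.
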
Figure 38 The three best-measured PCs for the fiducial program (solid curves) and from programs 
with SN, BAO, or WL errors halved (as labeled). The top row uses the fiducial version of the WL 
forecast, while the bottom row uses the optimistic WL forecast with reduced systematic errors. 
Although not indicated in the plot legends, all forecasts here include the default Planck CMB 
Fisher matrix. For all PCs shown here, e_i(z) is nearly zero for 3 < z < 9. 

oscillations in w(z) tend to cancel out in these integrals. Many of the most poorly-measured PCs 
depend on the chosen BAO binning scheme, since narrower BAO bins can better sample rapid 
changes in w(z). As an example, we show how the PCs of the fiducial program are affected by 
doubling the number of BAO bins in Figure 39. 

The maximum redshift probed by SN, BAO, and WL data, primarily set by the highest-redshift 
BAO constraint at z = 3 in our forecasts, imprints a clear signature in the set of PCs in Figure 39. 
At high redshift, specifically z > 3 (a < 0.25), the first 29 PCs have almost no weight. Conversely, 
PCs 30 and 32-36 only vary significantly at high redshift and are nearly flat for z < 3; additionally, 
the errors on these PCs are many times larger than those of the first 29 PCs. Thus, w(z) variations 
above and below z = 3 are almost completely decoupled from each other in the fiducial forecasts, 
and the high-redshift variations are effectively unconstrained. CMB data limit the equation of state 
at z > 3 to some extent, for example, through comparison of the measured distance to the last 
scattering surface with the distance to z = 3 measured in BAO data. However, such constraints 
are very weak when split among several independent w(z) bins at high redshift. Furthermore, 
since the dark energy density typically falls rapidly with increasing redshift, variations in w(z) at 
high redshift are intrinsically less able to affect observable quantities than low-redshift variations. 



Note that our w_i parameterization has exactly (0.25 - 0.1)/0.025 = 6 bins at 3 < z < 9 and 30 bins at z < 3. 
PC 31 parameterizes variations in the lowest redshift bin w_1, which is poorly constrained as discussed in §8.3.3. 
Figure 39 PCs for the fiducial program (solid blue curves). Dotted red curves double the number 
of bins used for BAO data from the default choice of 20 to 40. 
resulting in reduced sensitivity to the high-redshift equation of state even in the presence of strong 
constraints at earlier epochs. Likewise, variations in w(z) at even higher redshifts of z > 9, where 
we assume that w is fixed to -1, are unlikely to significantly affect constraints on w(z) at low 
redshift. 

Figure 40 shows how the inverse variance σ_i^-2 of the 10 best-measured w(z) PCs increases 
relative to the fiducial program if we halve the errors on the SN, BAO, or WL data. Following 
Albrecht et al. (2009), when computing these ratios σ_(2)i^-2/σ_(1)i^-2 (where 1 denotes the fiducial program 
and 2 the improved program), we first limit PC variances to unity by making the substitution 
σ_i^-2 → 1 + σ_i^-2, so that uninteresting improvements in the most poorly-measured PCs do not count 
in favor of a particular forecast. We caution that, as noted earlier, the PC shapes themselves are 
changing as we change the errors assumed in the forecast, so σ_(2)i^2 and σ_(1)i^2 are not variances of 
identical w(z) components. However, as shown in Figure 38, these changes are not drastic if we 
consider factor-of-two variations about our fiducial program. 
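
The effect of the unit-variance substitution on these ratios can be seen in a short sketch. It compares the fiducial and optimistic-WL errors from Table 11 for the best-measured and worst-measured PCs. Using the WL-opt column as the "improved" program is an illustrative choice, and, as cautioned above, the two columns do not refer to identical PC shapes.

```python
def inv_var_ratio(sigma_fid, sigma_new):
    """Ratio of inverse variances after the substitution sigma^-2 -> 1 + sigma^-2."""
    return (1.0 + sigma_new**-2.0) / (1.0 + sigma_fid**-2.0)

def raw_ratio(sigma_fid, sigma_new):
    """Ratio of inverse variances with no prior applied."""
    return (sigma_fid / sigma_new) ** 2

# (fiducial, WL-opt) errors for PC 1 and PC 36 from Table 11.
pc1 = (0.011, 0.009)
pc36 = (74.12, 118.0)

ratio_pc1 = inv_var_ratio(*pc1)    # well measured: the prior changes almost nothing
ratio_pc36 = inv_var_ratio(*pc36)  # poorly measured: the ratio is pinned close to 1
raw_pc36 = raw_ratio(*pc36)        # without the prior the ratio is about 0.39
```

For PC 1 the prior is irrelevant and the ratio is about 1.5, while for PC 36 the prior holds the ratio at essentially 1, so changes in components with errors far above unity neither help nor hurt a forecast.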

The differences in σ_i^-2 ratios among improvements in SN, BAO, and WL errors are striking. 
Relative to the fiducial program, reduced SN errors mainly contribute to knowledge of the first few 
PCs. For the fiducial WL systematics, reducing WL errors helps to better measure several of the 
highest-variance PCs in the plot (i > 10), but it makes little difference to the well measured PCs. 
Reducing BAO errors tightens constraints on nearly all of the PCs, with the greatest impact in 
the intermediate range between the SN and WL contributions. Assuming the optimistic WL errors 
gives much greater weight to WL improvements, which now produce the largest improvement in 
the first five PCs (right panel of Figure 40). The trends for reducing SN or BAO errors are similar 
to before, but the magnitude of their effect is smaller because they are competing with tighter WL 
constraints. The behavior of the σ_i^-2 ratios of the best-measured PCs mirrors that shown for the 
DETF FoM in Figure 30. With the fiducial WL systematics, BAO measurements have the greatest 
leverage, followed by SN, and the impact of reducing WL errors is small. With the optimistic WL 
systematics, on the other hand, reducing WL errors makes the largest difference, followed by BAO, 
followed by SN. 

Dotted curves in the left hand panel show the σ_i^-2 ratios when we fix the PCs to be those of the 
fiducial program. In this case, the PC errors for the improved programs are no longer uncorrelated, 
but the correlation coefficient of errors among any pair of PCs is less than 0.5 in nearly all cases. 
Results are similar to before except for the first component (first two components for BAO). These, 
of course, show less improvement when they are fixed to be those of the fiducial program rather 
than shifting to be the components best determined by the improved data. Figure 41 shows the 
expected improvements in σ_i^-2 between our fiducial Stage III and Stage IV programs. Consistent 
with the DETF FoM plots in Figure 30, the expected improvements are dramatic, and considerably 
more so with the optimistic WL assumptions. 

The DETF FoM compresses constraints in the w_0-w_a model to a single number. Similar figures 
of merit for PC constraints have been defined in the literature, in various forms, each of which may 
be useful for different purposes. These include the determinant of F^w, which characterizes the total 
volume of parameter space allowed by a particular combination of experiments in analogy to the 
DETF FoM for the w_0-w_a parameter space, and the sum of the inverse variances of the PCs, which is 
typically less sensitive than the determinant to changes in the errors of the most weakly constrained 


This partly depends on the choice of fiducial model at which the Fisher matrix used to construct the PCs is 
computed. Taking a fiducial model with a larger dark energy density at high redshift than in ΛCDM makes the 
low-redshift PC shapes more sensitive to assumptions about the high-redshift equation of state (e.g., de Putter and Linder 
2008). 
Figure 40 Ratios of inverse variances of PC amplitudes for variants of the fiducial program to the
fiducial inverse variances (points and solid curves). Each variant divides SN, BAO, or WL errors 
by a factor of 2 while keeping other probes fixed at the fiducial errors. The left panel assumes the 
default WL forecast and the right panel assumes the optimistic version. Dotted curves in the left
panel use σ̃_i instead of σ_i, where σ̃_i describes how well the amplitudes of the fiducial set of PCs are
expected to be measured by some variant of the fiducial forecast.




Figure 41 Ratios of inverse variances of PC amplitudes of Stage IV to those of Stage III, assuming 
either the fiducial or optimistic versions of the Stage IV WL forecast. 
Examples of these FoMs for the fiducial program and the variants considered in Figure 40 are
listed in Table 12. Here we allow the PC basis to change with the forecast assumptions, so F^w is
diagonal and det F^w = ∏_i σ_i^{-2}. As with the ratios of PC variances in Figure 40, we restrict the
variances to be less than unity by replacing σ_i^{-2} → 1 + σ_i^{-2}. The other FoM, computed as the sum
of inverse variances, requires no such prior because PCs with large variances contribute negligibly 
to the sum. Note that the choice of PC FoM definition can affect decisions about whether one 
experiment or another is optimal; for example, halving WL errors (assuming fiducial systematics) 
relative to the fiducial model increases the det F^w FoM more than halving SN errors, but the
opposite is true for the sum of inverse variances, which favors improvements in the best-measured 
PCs and more closely tracks the DETF FoM. In this case, at least, we regard the latter measure 
as a better diagnostic, since the improvements for PCs that are poorly measured in any case seem 
unlikely to reveal departures from a cosmological constant or other simple dark energy models. 
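
To make the difference between the two PC figures of merit concrete, the following Python sketch evaluates both for a toy set of PC errors. The σ_i values and the two "improved" variants are invented purely for illustration and are not taken from Table 12; the determinant-style FoM uses the 1 + σ_i^{-2} replacement described above.

```python
import numpy as np

def det_fom(sigma):
    """Determinant-style FoM: product over PCs of (1 + sigma_i^-2)."""
    sigma = np.asarray(sigma, dtype=float)
    return float(np.prod(1.0 + sigma**-2))

def sum_fom(sigma):
    """Sum-style FoM: sum over PCs of the inverse variances sigma_i^-2."""
    sigma = np.asarray(sigma, dtype=float)
    return float(np.sum(sigma**-2))

# Hypothetical PC errors, ordered from best to worst measured.
baseline = np.array([0.01, 0.05, 0.3, 2.0])
# Variant A (hypothetical): halve the error of the best-measured PC only.
variant_a = baseline * np.array([0.5, 1.0, 1.0, 1.0])
# Variant B (hypothetical): halve the errors of the two worst-measured PCs.
variant_b = baseline * np.array([1.0, 1.0, 0.5, 0.5])

for name, sig in [("A", variant_a), ("B", variant_b)]:
    det_gain = det_fom(sig) / det_fom(baseline)
    sum_gain = sum_fom(sig) / sum_fom(baseline)
    print(f"variant {name}: det FoM gain = {det_gain:.2f}, sum FoM gain = {sum_gain:.3f}")
```

In this toy case the determinant-style FoM rewards variant B more, while the sum of inverse variances rewards variant A more, which is the kind of ranking disagreement discussed in the text.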

The disagreement between different PC FoMs in Table 12 highlights one of the difficulties with
using PCs or related methods for evaluating the potential impact of future experiments. Forecasts
for PCs provide a wealth of information in both the redshift-dependent shapes of the PCs and
the expected errors on their amplitudes, but it is often difficult to interpret what this information
implies about cosmic acceleration. Given a set of forecasts for PCs, one can easily compute the
expected constraints on any specific model for w(z) by expressing the model in terms of the PC
amplitudes (eq. 173); this is a potentially useful application, but it makes very limited use of the
available information.

More generally, we can use the forecast PC shapes and errors to try to visualize what types of
w(z) variations are allowed by a certain combination of experiments. One approach is to generate
several random w(z) curves that would be consistent with the forecast measurements. This method
is easily implemented with the PCs because the errors on different PC amplitudes are uncorrelated.
One can generate a random realization of w(z) by simply drawing each amplitude α_i from a Gaussian
distribution with mean zero and width σ_i, then using equation (174) to compute the w(z) corresponding
to the randomly drawn values.
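
The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PC shapes here are a random orthonormal basis and the errors are a made-up geometric sequence (the real shapes and errors are those of Figure 39 and Table 11), and the expansion is assumed to take the form 1 + w(z_j) = Σ_i α_i e_i(z_j) about a w = -1 fiducial model.

```python
import numpy as np

def draw_w_realizations(pcs, sigmas, n_draws, rng):
    """Draw random w(z) curves from uncorrelated Gaussian PC amplitudes.

    pcs    : array (n_pc, n_bins), row i holds the PC shape e_i(z_j)
    sigmas : array (n_pc,), forecast 1-sigma error on each amplitude
    Returns an array (n_draws, n_bins) of binned w values.
    """
    # Each amplitude alpha_i ~ N(0, sigma_i^2), independent of the others.
    alphas = rng.normal(0.0, sigmas, size=(n_draws, len(sigmas)))
    # Assumed expansion about a cosmological constant: w = -1 + sum_i alpha_i e_i.
    return -1.0 + alphas @ pcs

rng = np.random.default_rng(0)
n_bins = 36
# Toy orthonormal basis standing in for the forecast PC shapes (hypothetical).
q, _ = np.linalg.qr(rng.normal(size=(n_bins, n_bins)))
pcs = q.T
# Toy amplitude errors, from well measured to poorly measured (hypothetical).
sigmas = np.geomspace(0.02, 2.0, n_bins)

w = draw_w_realizations(pcs, sigmas, n_draws=20, rng=rng)
print(w.shape, float(w.std()))
```

Because the amplitudes are independent, no covariance matrix needs to be factorized; the scatter of the resulting curves is dominated by the PCs with the largest σ_i.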

In the upper left panel of Figure 42 we use this method to plot several w(z) models using the
fiducial program PC shapes and errors from Figure 39 and Table 11, respectively. We cut off the
plot at z = 3, since w(z) variations at higher redshifts are essentially unconstrained by the fiducial
experiments. Even at lower redshifts, though, the allowed w(z) variations are enormous, with w_i
values often changing by 10 or more from one bin to the next. (Recall that our prior corresponds
to a Gaussian of width σ_{w_i} ≈ 63 per bin, eq. 171.) Compared to the ~1.5% constraints on w_p in
the w_0-w_a model, this forecast looks rather depressing. The consequence of allowing the equation
of state to be a free function of redshift is that it is nearly impossible to say with any certainty
what the value of w is at any specific redshift, because rapid oscillations in w(z) have tiny effects
on observables. The allowed range of variations would be even larger if we considered a model with
finer Δa bins.

The large variations of w(z) in Figure 42 are driven by the poorly constrained PCs, which have
many oscillations in w(z), peak-to-peak amplitudes |Δw| ~ 4, and normalization uncertainties
σ_i ~ 0.1-2.3 (see Figure 39 and Table 11). The lower left panel of Figure 42 shows these w(z)
realizations averaged over bins of width Δz = 0.4, which vastly reduces the range of variations,
especially at z ~ 1. However, the dispersion of w(z) in the bins centered at z = 0.6 and z = 1 is
still about 0.3. Adding a precise, independent measurement of H_0 reduces the uncertainty in w(z)
in the lowest-redshift bin, but it has little effect at higher redshifts (see §8.5.1).

Instead of averaging w(z) over wide redshift bins, one can impose a theoretical prejudice for
models with smoothly varying equations of state by adding an off-diagonal prior to the Fisher
matrix, imposing correlations among the w_i that are closely separated in redshift. Here we follow
Crittenden et al. (2009), but we modify their method to use scale factor rather than redshift as the
Figure 42 Reconstruction of w(z) from PC constraints. Left: 20 randomly generated models that
would be indistinguishable from a cosmological constant using the fiducial program of experiments.
Three of the 20 models are highlighted (in red, green, and blue) to more clearly show examples
of the evolution with redshift. The lower panel shows the average of 1 + w(z) in bins of width
Δz = 0.4 for the same models as in the upper panel. Points along the w(z) = -1 line in the upper
panel mark the centers of the bins in which w(z) is allowed to vary in our forecasts. Right: w(z)
reconstruction including a prior of the form in equation (177). The upper panel shows a random
selection of models consistent with this prior, but without including any data, and the lower panel
shows examples of models that are allowed by both the prior and the data assumed in the fiducial
program.
where Δw sets the amplitude of allowed w(z) variations and Δa_c is the correlation length. Following
the calculation in Crittenden et al. (2009), the covariance matrix for the w_i bins, which is the inverse
of the prior Fisher matrix for those parameters, is



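\[
\left[ F^{\rm prior} \right]^{-1}_{ij \; (i,j \le 36)} = \frac{(\Delta w)^2 \, \Delta a_c}{\pi (\Delta a)^2}
\left[ x_+ \tan^{-1} x_+ + x_- \tan^{-1} x_- - 2x \tan^{-1} x
+ \ln \left( \frac{1 + x^2}{\sqrt{(1 + x_+^2)(1 + x_-^2)}} \right) \right]
\qquad (177)
\]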

where x = |i - j| Δa/Δa_c, x_+ = (|i - j| + 1) Δa/Δa_c, and x_- = (|i - j| - 1) Δa/Δa_c. In the
limit Δa_c → 0, this reduces to our default diagonal prior on the w_i parameters with width σ_{w_i} =
Δw/√Δa.
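
As a numerical check of the stated limit, the following Python sketch builds a binned covariance of this form and compares its diagonal with (Δw)²/Δa when the correlation length is made very small. It rests on assumptions that should be read as such: the bracketed combination of arctangent and logarithm terms is taken with numerator 1 + x² inside the logarithm, the prefactor is taken to be (Δw)² Δa_c / [π (Δa)²] because that choice reproduces the diagonal limit quoted above, and the bin width Δa = 0.025 for 36 bins is a hypothetical value chosen for illustration.

```python
import numpy as np

def smooth_prior_cov(n_bins, delta_a, delta_w, a_c):
    """Binned prior covariance for w_i with correlation length a_c in scale factor.

    Assumed form: prefactor * [x+ atan(x+) + x- atan(x-) - 2 x atan(x)
                               + ln((1 + x^2) / sqrt((1 + x+^2)(1 + x-^2)))],
    with prefactor = delta_w^2 * a_c / (pi * delta_a^2).
    """
    idx = np.arange(n_bins)
    k = np.abs(idx[:, None] - idx[None, :])        # bin separation |i - j|
    x = k * delta_a / a_c
    xp = (k + 1) * delta_a / a_c
    xm = (k - 1) * delta_a / a_c
    bracket = (xp * np.arctan(xp) + xm * np.arctan(xm) - 2.0 * x * np.arctan(x)
               + np.log((1.0 + x**2) / np.sqrt((1.0 + xp**2) * (1.0 + xm**2))))
    return delta_w**2 * a_c / (np.pi * delta_a**2) * bracket

n_bins, delta_a = 36, 0.025            # delta_a is a hypothetical bin width
delta_w = np.sqrt(delta_a)             # so that delta_w / sqrt(delta_a) = 1
cov = smooth_prior_cov(n_bins, delta_a, delta_w, a_c=0.2)

# Limit of vanishing correlation length: diagonal should approach delta_w^2 / delta_a.
cov_small = smooth_prior_cov(n_bins, delta_a, delta_w, a_c=1e-6)
print(cov_small[0, 0], delta_w**2 / delta_a)

# Example draws of binned w centered on w = -1 (centering is an assumption here).
rng = np.random.default_rng(1)
draws = rng.multivariate_normal(np.full(n_bins, -1.0), cov, size=5)
print(draws.shape)
```

With a finite correlation length the off-diagonal elements fall off smoothly with bin separation, which is what produces the smoother realizations.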

The upper right panel of Figure 42 shows models randomly drawn from this prior with Δw/√Δa =
1 and Δa_c = 0.2. The influence of the correlation function is clearly evident in the smoother,
lower-amplitude variations of w(z) in these models, and yet the range of possible models is still much
greater than for simpler parameterizations like w_0-w_a. Combining this prior with the assumed
data set of the fiducial Stage IV program, we obtain the w(z) realizations plotted in the lower right
panel of Figure 42. Even more so than averaging w(z) in wide redshift bins, including this type of
prior significantly narrows the constraints on w(z). While the particular smoothness prior of equation (176)
is certainly not unique, this approach of combining PC constraints dictated by the data sets with
theoretically motivated priors on the behavior of w(z), perhaps based on an underlying model
for the potential V(φ), may be the most valuable application of the PC approach.

Our constraints on general w(z) models account for the possibility of modified gravity by
marginalizing over the structure growth parameters Δγ and ln G_9. If we instead restrict our
analysis to GR by fixing Δγ = ln G_9 = 0, the main effect is that the dark energy equation of state at
high redshifts, w(3 < z < 9), is better constrained because the CMB measurement of the power
spectrum amplitude at z ~ 1000 can be more directly related to WL measurements of growth at
lower redshifts. Because of the additional CMB constraint on the distance to the last scattering
surface, w(3 < z < 9) is strongly correlated with Ω_k, and therefore assuming GR considerably
improves the determination of spatial curvature in the binned w(z) parameterization. For our fiducial
forecasts, assuming Δγ = ln G_9 = 0 lowers σ_{Ω_k} by a factor of ~3 (0.0075 → 0.0023); note that
this is still several times larger than the error in Ω_k for the simpler ΛCDM or w_0-w_a forecasts.

8.4. Forecasts for Clusters

We have concentrated so far on the constraints expected for combinations of CMB, SN, BAO, 
and WL data, as all of these methods are well studied and are likely to play a central role in Stage III 
and Stage IV studies of cosmic acceleration. For other methods we adopt a simplified approach, first 
asking how well our fiducial CMB+SN+BAO+WL programs should predict the basic observables 
of these methods, then showing how different levels of precision on these observables would affect 
constraints on equation-of-state and growth parameters. We describe our methodology more fully 
in the next section (§8.5), but we begin with a discussion of clusters, where our analysis of stacked
weak lensing calibration (§6.3.3) gives a clear quantitative target for measurement precision.

Figure 43a shows the predicted fractional error (1σ) in σ_8(z) for the fiducial Stage III and Stage
IV experimental programs discussed in §8.3 and for the Stage IV program with optimistic WL
errors. All curves assume a w_0-w_a dark energy parameterization, and for each case the lower,
thinner curve shows the forecast assuming GR to be correct while the upper, bolder curve allows GR
deviations parameterized by G_9 and Δγ. Roughly speaking, we would expect a measurement with
precision better than that shown by the upper curve to significantly improve tests for GR deviations
and a measurement with precision better than that shown by the lower curve to significantly improve
w_0-w_a constraints when assuming GR to be correct. For Stage IV programs we predict σ_8(z)
constraints at the 0.75-1% level over the full redshift range 0 < z < 3, with little difference
between the fiducial and optimistic WL assumptions. In fact, the "optimistic" WL assumptions
lead to slightly larger errors in σ_8(z) than the fiducial assumptions because for this quantity doubling
the statistical errors has a larger impact than adding 2 × 10^-3 shear calibration and photo-z errors
(see §8.1). For Stage III, the predicted σ_8(z) errors are about 1.2% assuming GR, but they are much
larger if we allow GR deviations, especially at z > 0.8. Even for Stage IV, the good constraints at
high z rely on the assumption of a w_0-w_a equation of state, which allows precise low redshift WL
measurements to be extrapolated to high redshift. The direct measurements of z > 1 clustering
amplitude are considerably weaker.

Figure 43b plots σ_{11,abs}(z) errors, which are tighter than the σ_8(z) errors by ~30-50% because
uncertainty in h contributes noticeably to the latter. In §6.6 we estimated the errors on σ_{11,abs}(z)
achievable with a 10^4 deg^2 cluster survey with weak lensing mass calibration, assuming Stage III
(10 arcmin^-2) or Stage IV (30 arcmin^-2) effective source densities and survey depths. For a mass
Figure 43 (a) Predicted fractional errors (1σ) on σ_8(z) from our fiducial Stage III (dotted) and
Stage IV (solid) CMB+SN+BAO+WL programs, and from the Stage IV program with optimistic
WL systematic assumptions (dashed). All curves assume a w_0-w_a dark energy parameterization.
For each case, the lower, thin curve shows the forecast assuming GR is correct and the upper, thick
curve shows the forecast allowing GR deviations parameterized by G_9 and Δγ. (b) Like (a), but
for σ_{11,abs}(z), the rms matter fluctuation in spheres of radius 11 Mpc (instead of 8 h^-1 Mpc). (c)
Like (b), but for the parameter combination σ_{11,abs}(z) Ω_m^{0.4} that approximates the quantity best
constrained by cluster abundances. (d) Predicted fractional errors in σ_{11,abs}(z) Ω_m^{0.4} from cluster
abundances in a 10^4 deg^2 survey calibrated by stacked weak lensing mass estimates with Stage III
(n_eff = 10 arcmin^-2) and Stage IV (n_eff = 30 arcmin^-2) source surface densities and survey depths
(dotted and solid curves, respectively). From top to bottom, curves correspond to cluster mass
thresholds of 8, 4, 2, and 1 × 10^14 M_⊙.

threshold of 2 × 10^14 M_⊙ the σ_{11,abs}(z) errors at z ≈ 0.5 are ~1% and ~0.5%, respectively, below
the corresponding Stage III and Stage IV errors shown in Figure 43b. Furthermore, these cluster
errors are per Δz = 0.1 redshift bin, so constraints on the clustering amplitude in a smoothly
Figure 44 Predicted constraints (1σ) on the equation-of-state parameters w_0.5 = w(z = 0.5)
and w_a (left panels) and the growth parameters G_9 and Δγ (right panels) from our fiducial
CMB+SN+BAO+WL programs combined with cluster abundance measurements of σ_{11,abs}(z) Ω_m^{0.4}
with the precision shown in Fig. 43d. Top panels show Stage III clusters with Stage III
CMB+SN+BAO+WL, while middle and bottom panels show Stage IV clusters combined with
the fiducial and WL-opt Stage IV programs, respectively. Note the change in axis scale between
the top and middle/lower panels. In each panel, the outermost contour shows the constraints
without clusters, and the remaining contours show the constraints for cluster mass thresholds of 8, 4,
2, and 1 × 10^14 M_⊙ (outer to inner).
evolving model can be substantially better if the cluster errors are not correlated across redshifts.
(The statistical errors should be uncorrelated, but some forms of weak lensing systematics could 
affect many redshift bins in the same direction.) 

The cluster errors shown earlier in Figure 28 were derived assuming perfect knowledge of Ω_m,
with σ_{11,abs}(z) as the single parameter controlling the cluster abundance at each redshift. In
practice, cluster abundances constrain a parameter combination that is approximately σ_{11,abs}(z) Ω_m^{0.4}, as
discussed in §6. The fractional errors in Ω_m from our fiducial CMB+SN+BAO+WL programs are
2.7% (Stage III), 1.4% (Stage IV), and 1.2% (Stage IV with WL-opt), making Ω_m^{0.4} uncertainties
comparable to the fractional errors in σ_{11,abs}(z). Figure 43c shows the predicted fractional errors
in σ_{11,abs}(z) Ω_m^{0.4}, which in some ranges are significantly larger than those for σ_{11,abs}(z). Finally,
Figure 43d shows our forecast errors on σ_{11,abs}(z) Ω_m^{0.4} from a 10^4 deg^2 cluster survey in which errors
are limited by weak lensing mass calibration statistics. Here we have simply set the fractional errors
from clusters on σ_{11,abs}(z) Ω_m^{0.4} equal to the ones we derived earlier on σ_{11,abs}(z), which should be
a good but not perfect approximation. Comparing Figures 43c and 43d shows that cluster errors
are competitive with those expected from the CMB+SN+BAO+WL combination for cluster mass
thresholds of ~4-8 × 10^14 M_⊙ at Stage III or 1-4 × 10^14 M_⊙ at Stage IV.

Figure 44 shows the potential improvement in equation-of-state and growth parameter
determinations from including the cluster constraints on σ_{11,abs}(z) Ω_m^{0.4}. We assume that these constraints
have independent errors in each Δz = 0.1 bin. Upper panels show the effect of adding Stage III
cluster constraints (dotted curves in Fig. 43d) to the Stage III CMB+SN+BAO+WL Fisher
matrix. Even adding clusters with an 8 × 10^14 M_⊙ mass threshold substantially improves the errors
on G_9 and Δγ, and reducing the mass threshold to 1-2 × 10^14 M_⊙ produces substantial further
gains. Somewhat surprisingly, the cluster constraints also lead to significantly smaller errors on
the equation-of-state parameter w_0.5 and slightly smaller errors on w_a. This improvement largely
reflects the additional information about Ω_m, which allows the distance and H(z) constraints from
other probes to translate more directly into w(z) constraints. We have checked that fixing Ω_m
exactly would produce a still greater improvement in (w_0.5, w_a) than the gain we have forecast from
clusters, while making little difference to the (G_9, Δγ) errors.

For Stage IV (middle and bottom panels), where we now assume the Stage IV cluster mass
constraints, an 8 × 10^14 M_⊙ cluster sample produces little improvement over CMB+SN+BAO+WL
in G_9 and Δγ, but it still leads to noticeable improvement in w_0.5. A 1-2 × 10^14 M_⊙ cluster
sample produces substantial gains in both the equation-of-state and growth parameters. As in the
Stage III case, much of the improvement in the equation of state comes from the Ω_m information
provided by clusters. However, the cluster constraints reduce the w_0.5 error even if Ω_m is held
fixed, so some of this improvement arises from another source, probably by allowing some WL
information to be effectively transferred from growth to distance. Adding our Stage IV, 10^14 M_⊙
cluster constraint to the fiducial Stage IV program increases the DETF FoM from 664 to 1258, and
it increases the modified FoM [σ(w_p)σ(w_a)]^-1 × [0.034/σ(Δγ)] (Figure 32) from 664 to 1955. For
the WL-opt program, the improvements are 789 → 1363 and 1037 → 2380, respectively. For Stage
III CMB+SN+BAO+WL, adding Stage III clusters leads to improvements of 131 → 183 (FoM)
and 30 → 137 (modified FoM).

Our treatment here is simplified because we have ignored the impact of volume-element changes
on the cluster abundance and have set the scaling index of σ_{11,abs}(z) Ω_m^q to a constant value q = 0.4
instead of including its redshift and mass dependence. More importantly, we have assumed that
errors in the cluster abundance will be dominated by the statistical errors in the weak lensing
calibration of the mean mass scale, not increased by marginalizing over uncertainties in
mass-observable scatter, incompleteness, contamination, or theoretical predictions. The effective mass
calibration uncertainties we are assuming are those in Figure 26. These are probably pessimistic
at z > 1, where the weak lensing calibration error exceeds 10% but one could probably use other
calibration methods (including direct comparison to theory) to do better; thus, we are underplaying
the potential contribution of high-redshift clusters. Our approximate calculations confirm the
conclusions of Oguri and Takada (2011) that clusters calibrated with stacked weak lensing can
make an important contribution to testing cosmic acceleration models, even in the era of Stage
IV dark energy experiments. Figure 43 also provides a target for other methods of measuring the
matter clustering amplitude, such as the Lyα forest (§7.6).



8.5. Forecasts for Alternative Methods 

We now turn to some of the alternative probes discussed previously in §7. For each technique,
we first focus on the question of complementarity with the primary methods by asking how well 
the observable quantity measured by a particular technique is already known given the fiducial 
combination of SN, BAO, WL, and CMB data. These predictions provide benchmarks that any 
additional measurement must reach in order to contribute significantly to constraints on dark energy 
or modified gravity parameters. In many cases, the precision of the predictions depends strongly 
on the chosen parameterization of deviations from the standard paradigm of ΛCDM and GR. We
will generally assume a w_0-w_a model for the results in this section, but we note that if one adopts
a more general parameterization of dark energy, the predictions are generally weaker and thus the 
value of alternative probes is potentially greater. 

The covariance matrix for a set of observables X measured by a particular alternative probe can
be computed straightforwardly using the covariance matrix of the cosmological parameters given
by the inverse of the total Fisher matrix for SN, BAO, WL, and CMB data,

\[
C^X_{ij} = \sum_{k,l} \frac{\partial X_i}{\partial p_k} \left( F^{-1} \right)_{kl} \frac{\partial X_j}{\partial p_l} ,
\]

where p is either the full set of parameters in eq. (161) or the reduced set with w_0 and w_a replacing
the 36 w_i bins; in the latter case, F is the Fisher matrix for the w_0-w_a parameterization computed
using eq. (169). We compute the full covariance matrices for the alternative methods, but the plots
in the following sections only show the predicted uncertainties σ_{X_i} = √(C^X_{ii}) and do not reflect the
fact that errors on the observables may be correlated.
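
The propagation from the parameter Fisher matrix to the covariance of derived observables is ordinary linear error propagation, and can be sketched in Python as follows. The 3×3 Fisher matrix and the Jacobian ∂X_i/∂p_k below are invented numbers used only to exercise the function; they do not correspond to any forecast in this paper.

```python
import numpy as np

def observable_covariance(fisher, jacobian):
    """Covariance of observables X given a parameter Fisher matrix.

    fisher   : array (n_par, n_par), total Fisher matrix F
    jacobian : array (n_obs, n_par), jacobian[i, k] = dX_i / dp_k
    Returns J F^{-1} J^T, an (n_obs, n_obs) covariance matrix.
    """
    param_cov = np.linalg.inv(fisher)
    return jacobian @ param_cov @ jacobian.T

# Hypothetical Fisher matrix for three parameters (symmetric, positive definite).
fisher = np.array([[400.0, 30.0, 10.0],
                   [30.0, 50.0, 5.0],
                   [10.0, 5.0, 20.0]])
# Hypothetical derivatives of two observables with respect to the parameters.
jacobian = np.array([[1.0, 0.5, 0.0],
                     [0.2, -1.0, 0.3]])

cov_x = observable_covariance(fisher, jacobian)
sigma_x = np.sqrt(np.diag(cov_x))                  # the quantities that get plotted
corr = cov_x[0, 1] / (sigma_x[0] * sigma_x[1])     # the correlation that is not plotted
print(sigma_x, corr)
```

The off-diagonal element illustrates the caveat in the text: the diagonal uncertainties alone do not show how strongly the predicted errors on different observables are correlated.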

In addition to computing how well the fiducial SN, BAO, WL, and CMB constraints predict 
each observable that would be measured by the alternative techniques, we provide several examples 
to show the improvement in the FoM and other parameters that would result from a specific 
measurement of that observable. For these tests, we only consider the impact of measurement of a 
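single quantity X at a time, so the total Fisher matrix is modified simply by adding the term

\[
\frac{1}{\sigma_X^2} \frac{\partial X}{\partial p_k} \frac{\partial X}{\partial p_l}
\]

to the element F_{kl}, where σ_X is the assumed uncertainty in the measurement of X.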
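
A minimal Python sketch of this update is given below, assuming the added term is the outer product of the gradient ∂X/∂p with itself divided by σ_X². The Fisher matrix and gradient are hypothetical numbers. The example also illustrates a general property of such an update: if the new measurement has the same precision as the existing prediction for X, the combined uncertainty on X shrinks by a factor of √2.

```python
import numpy as np

def add_measurement(fisher, grad_x, sigma_x):
    """Return the Fisher matrix after adding one measurement of X with error sigma_x.

    grad_x[k] = dX / dp_k evaluated at the fiducial model.
    """
    grad_x = np.asarray(grad_x, dtype=float)
    return fisher + np.outer(grad_x, grad_x) / sigma_x**2

def predicted_sigma(fisher, grad_x):
    """1-sigma uncertainty on X implied by a parameter Fisher matrix."""
    grad_x = np.asarray(grad_x, dtype=float)
    return float(np.sqrt(grad_x @ np.linalg.inv(fisher) @ grad_x))

# Hypothetical Fisher matrix and gradient, for illustration only.
fisher = np.array([[400.0, 30.0, 10.0],
                   [30.0, 50.0, 5.0],
                   [10.0, 5.0, 20.0]])
grad_x = np.array([1.0, 0.5, 0.0])

sigma_before = predicted_sigma(fisher, grad_x)
# Add a measurement exactly as precise as the existing prediction.
fisher_new = add_measurement(fisher, grad_x, sigma_x=sigma_before)
sigma_after = predicted_sigma(fisher_new, grad_x)
print(sigma_before, sigma_after, sigma_before / sigma_after)
```

How much a figure of merit improves for a given σ_X depends on how the gradient of X aligns with the poorly constrained parameter directions, which is why the text evaluates each observable separately.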



8.5.1. The Hubble constant 

For the Hubble constant, the predicted uncertainty from the fiducial probes is simply the value
of σ_h that comes out of the Fisher matrix forecasts of the previous section. Assuming a w_0-w_a dark
energy model, the expected precision on H_0 is 0.7% for the fiducial Stage IV forecasts and small
variations of those forecasts, and 1.3% for Stage III (see Tables 8-10). These are challenging, but
probably attainable, targets for future efforts to independently measure H_0.
Figure 45 Dependence of the DETF FoM on the accuracy of additional measurements of the Hubble
constant for Stage III and IV forecasts from §8.3. The fiducial Stage IV program with FoM = 664
is marked by an open circle.

In Figure 45 we show the effect on the DETF FoM of adding a prior on H_0 to the fiducial Stage
III and IV forecasts. In all cases, adding a prior with precision that matches the uncertainty one
would have in the absence of the prior increases the FoM by ~40%. The uncertainties in other
parameters are affected little by the inclusion of an independent H_0 measurement, as discussed in

ED 

For a more general dark energy parameterization such as the binned w_i values, predictions
for σ_h can be orders of magnitude weaker than they are for w_0-w_a or ΛCDM (see Figs. 36-37).
In this case an independent, local measurement of H_0 is vital for accurate determination of the
Hubble constant. However, H_0 priors do not significantly improve dark energy constraints in this
case; an H_0 constraint limits the range of w(z) in the lowest-redshift bin, but since w(z = 0) is
only weakly correlated with the equation of state at higher redshifts by SN, BAO, WL, and CMB
data, the impact of an additional H_0 measurement on the equation of state at z > 0 is small.
The improvement in the DETF FoM in Fig. 45 is largely a consequence of the restrictions that
the w_0-w_a parameterization places on the evolution of w(z) between z = 0 and higher redshifts.
Of course, a discrepancy between directly measured H_0 and a w_0-w_a prediction would already
provide the crucial insight that w_0-w_a is inadequate; it just wouldn't give further direction about
the evolution of w(z).

8.5.2. The Alcock-Paczynski Test 

For the AP test (§7.3), we consider the observable H(z)D_A(z). Since Stage IV BAO data provide
tight constraints on both H(z) and D_A(z), which are further strengthened by the SN, WL, and
CMB measurements, it is not surprising that the product H(z)D_A(z) is predicted very precisely in
the combined forecasts. The left panel of Figure 46 shows that the uncertainty in the AP observable
is ~0.2% at 0 < z < 3 for Stage IV data, and it is still predicted to sub-percent accuracy with Stage
III data. Independent measurements of the AP observable that are significantly less precise than
Figure 46 Left: Predicted fractional error (1σ) of the AP parameter H(z)D_A(z) from our fiducial
Stage III and Stage IV CMB+SN+BAO+WL programs, assuming a w_0-w_a dark energy
parameterization. Right: Dependence of the DETF FoM on additional measurements of H(z)D_A(z) at a
single redshift. For each forecast, the three curves from top to bottom assume AP measurements
at z = 0.5, z = 1, and z = 2, respectively.

these predictions would contribute little to cosmological constraints. Note that these results are for
a w_0-w_a dark energy model. If we instead use independently varying w(z) bins, the uncertainty in
the AP observable for the Stage IV forecasts increases to ~1% at 1 < z < 3 and becomes much
larger at both lower and higher redshifts, although the exact precision of the predictions in this
case depends strongly on the detailed forecast assumptions such as the prior on w_i in each bin or
the number of bins used for BAO data.

In the right panel of Fig. 46 we show the improvement in the DETF FoM (assuming the w_0-w_a
parameterization) when various measurements of the AP observable are added to the fiducial
Stage III and IV forecasts. Since the predictions for H(z)D_A(z) are weakest at z < 0.5, a direct
measurement of the AP observable at those redshifts has a greater impact on the FoM than
measurements at higher redshifts. A 1% measurement of H(z)D_A(z) at z = 0.5 increases the Stage
III FoM by about 13%; a similar improvement in the Stage IV FoM requires an accuracy of 0.5%
at the same redshift. While the demands suggested by Figure 46 appear stiff, large redshift surveys
in principle have the information to achieve very high precision on H(z)D_A(z). The challenge is
lowering systematics to the level needed to achieve this precision.

8.5.3. Redshift-space Distortions

For redshift-space distortions (RSD; §7.2), the relevant observable is σ_8(z)f(z). WL data
provide some limits on this observable by constraining the structure growth parameters Δγ and (in
combination with the CMB) G_9, and through their constraints on the expansion history all of the



Note, however, that either decreased SN errors or increased BAO errors for any of these forecasts would reduce
the difference between the predictions at z < 1 and at z > 1.







Figure 47 Left: Predicted fractional error (1σ) of the RSD observable σ_8(z)f(z) from our fiducial
Stage III and Stage IV CMB+SN+BAO+WL programs, assuming a w_0-w_a dark energy
parameterization. The lower, thin curves for each forecast additionally assume GR by fixing Δγ = ln G_9 = 0.
Right: Improvement in the 1σ uncertainty on Δγ from an additional RSD measurement at a single
redshift. For each forecast, the lower and upper curves assume RSD measurements at z = 0.2 and
z = 1, respectively.

acceleration probes contribute indirectly to the predicted growth history. The resulting predictions
for Stage III and IV programs are plotted in the left panel of Figure 47. We show predictions
both for the general case where we marginalize over the structure growth parameters and for GR
(Δγ = ln G_9 = 0).

With the assumption of GR, the RSD observable is predicted to 1-2% accuracy for Stage III
and 0.5-1% accuracy for Stage IV. If we allow modifications to GR through Δγ and G_9, however,
the uncertainty at z < 1 increases dramatically. This change is mainly tied to the freedom to alter
the growth rate f(z) at low redshift by varying Δγ. Note that the effect of Δγ vanishes at high
redshift because Ω_m(z) approaches unity and therefore f(z) → f_GR(z) (see equation 05]). At z > 2,
uncertainty in G_9 significantly weakens Stage III predictions of the RSD observable, but the effect
on Stage IV predictions is much smaller.

The DETF FoM can be improved by the addition of precise RSD measurements if we assume 
GR; for example, the fiducial Stage IV (Stage III) FoM increases by ~ 10-15% with a 1% (2%) RSD 
constraint at z = 1. Without assuming GR, the additional information from an RSD measurement 
at a single redshift goes mainly into constraining the structure growth parameters (and thus testing 
GR). In this case, the FoM improvement from percent-level RSD constraints is < 10%. However, 
percent-level measurements in several redshift bins can still have an important impact on the FoM. 

Low-redshift measurements of the RSD observable can contribute significantly to constraints
on Δγ, as shown in the right panel of Fig. 47. For Stage III forecasts, 1-2% RSD measurements
at z = 0.2 reduce the error in Δγ by nearly an order of magnitude, reaching an uncertainty
comparable to that expected from the Stage IV probes. Likewise, the Stage IV constraint on Δγ
can be improved by a factor of a few by the addition of percent-level RSD measurements. At
Figure 48 Predicted fractional error (1σ) on the distance D_A(z) from our fiducial Stage III and
Stage IV CMB+SN+BAO+WL programs and the Stage IV program with optimistic weak lensing
assumptions. For each case, the lower, thin curve assumes a w_0-w_a dark energy parameterization,
while the upper, thick curve represents our binned w(z) model.

higher redshifts, the impact of RSD observations on the Δγ uncertainty is greatly reduced due to
the diminishing effect of Δγ on the growth rate at high z. This reduced sensitivity at high z is in
some sense an artifact of the Δγ parameterization; the error on Δγ is larger than the error on ln f
by a factor |(d ln f/dΔγ)^-1| = |[ln Ω_m(z)]^-1|, which for the fiducial cosmological model of Table 7
is 1.02 at z = 0.2 [where Ω_m(z) = 0.373] and 3.23 at z = 1 [where Ω_m(z) = 0.734].
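
The quoted amplification factors can be checked with a short calculation. The sketch below simply evaluates |1/ln Ω_m(z)| at the two Ω_m(z) values given in the text; since those inputs are rounded to three digits, agreement with the quoted factors is expected only to about the last digit.

```python
import math

def delta_gamma_error_factor(omega_m_z):
    """Ratio of the error on Delta-gamma to the error on ln f: |1 / ln Omega_m(z)|."""
    return abs(1.0 / math.log(omega_m_z))

# Omega_m(z) values as quoted in the text for the fiducial model.
for z, omega_m_z in [(0.2, 0.373), (1.0, 0.734)]:
    factor = delta_gamma_error_factor(omega_m_z)
    print(f"z = {z}: Omega_m(z) = {omega_m_z}, factor = {factor:.3f}")
```

The factor grows without bound as Ω_m(z) approaches unity, which is the sense in which the loss of sensitivity to Δγ at high redshift reflects the parameterization rather than the data.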

We have computed but not plotted the impact of RSD measurements on the growth normalization
parameter G_9. For Stage IV, the uncertainty in G_9 is little affected by adding RSD
measurements at any redshift. For Stage III, 1-2% measurements of σ_8(z)f(z) can reduce the fractional
error in G_9 by up to a factor of two.

8.5.4. Distances

As a target for alternative distance indicator methods (§7.4) and standard sirens (§7.5),
Figure 48 plots the predicted fractional error on the angular diameter distance from our fiducial Stage
III and Stage IV CMB+SN+BAO+WL programs. If we assume a w_0-w_a model then the
constraints are tight, better than ≈0.25% for Stage IV and 0.5% for Stage III at all z > 0.5.
However, the highest redshift distance measurements included in our forecasts (other than CMB)
are BAO measurements at z = 3, so when we change to our general w(z) model the distance errors
at z > 4 become dramatically worse, ~2% for Stage IV and ~8% for Stage III. Furthermore,
our Stage III forecast assumes a 0.8% distance measurement from HETDEX at z = 2.4, which
we consider somewhat optimistic because it assumes that the full power spectrum shape can be
used rather than the BAO scale alone. At z < 2 the Stage III curve has a jagged structure that
depends to some degree on the specific choices we have made in binning and in assigning BAO/WL
measurements to particular redshifts.

The message to take away is that Stage III distance errors for the general w(z) model should
be in the 1-2% range at z < 1, the 3-5% range at 1 < z < 2, and the 6-8% range at z > 4,
with the errors at 2 < z < 4 depending on the strength of BAO measurements from Lyα emission
line galaxies (HETDEX) or the Lyα forest (BOSS). On the Stage III timescale, alternative distance
measurements at z > 1 with few percent precision could reveal otherwise hidden departures from
the w_0-w_a model. For Stage IV, where we assume powerful BAO experiments extending to z = 3,
the demands on alternative distance indicators are much stiffer. Even for the general w(z) model,
alternative measures at z > 4 must reach 2% precision to be competitive. The Stage IV distance
errors in this model become large at z < 0.25, similar to the several percent errors in H_0 seen
in Figure 37. As already discussed in §8.5.1, precise low redshift distance measurements have the
potential to reveal late-time departures from smooth w(z) evolution.

8.6. Prospects with Many Probes 

Section 8.3 demonstrates the power of a combined CMB+SN+BAO+WL experimental program,
but §§8.4-8.5 show that other probes could add substantial further sensitivity to dark energy or
modified gravity. As a concluding plot, we show in Figure 49 the result of combining our fiducial
CMB+SN+BAO+WL programs with representative performance estimates for clusters,
redshift-space distortions, and direct $H_0$ measurement. (While the AP test could also play an important
role, we consider current understanding of its systematic uncertainty too limited to allow even
representative performance estimates.) Top, middle, and bottom panels show inverse errors on
$w_p$, $w_a$, and $\Delta\gamma$, respectively, assuming a $w_0$-$w_a$ model with $G_9$ and $\Delta\gamma$ as beyond-GR growth
parameters. Black bars show the results of combining all of these probes, while colored bars show
the cumulative impact of successively omitting individual probes (see further explanation below).

For Stage IV we assume our fiducial CMB, SN, BAO, and WL constraints, and we include
cluster (CL) constraints as discussed in §8.4, assuming a cluster sample that reaches to $2 \times 10^{14} M_\odot$
over $10^4$ deg$^2$, limited by the mass calibration uncertainty from stacked weak lensing with a source
density of 30 arcmin$^{-2}$ (see §6.3.3). For $H_0$ we assume a precision of 1%. For redshift-space
distortions (RSD), we take the forecast errors on $f(z)\sigma_8(z)$ that White et al. (2008) present for a
Euclid/WFIRST-like survey (see §7.2). For Stage III we adopt the CMB+SN+BAO+WL errors
summarized in Table 7. (Note, in particular, that our assumed Stage III SN errors are 0.02 mag
per $\Delta z = 0.2$ bin and our Stage IV errors are 0.01 mag per bin, with the same error for the local
calibrator sample at $z = 0.05$.) For Stage III clusters we assume 5000 deg$^2$ and a source density of
10 arcmin$^{-2}$ for mass calibration (both appropriate to DES), while keeping the mass threshold at
$2 \times 10^{14} M_\odot$. For $H_0$ we assume 2% errors, and for RSD we take the White et al. (2008) forecasts
for BOSS. Finally, for current data we take WMAP CMB errors, Union2 SN errors, and the BAO
data and errors described in §4.2. We adopt the RSD errors reported by Blake et al. (2011) from
WiggleZ (see §7.2). We also include a 3% error on $H_0$ and a 4% error on $\sigma_8\Omega_m^{0.4}$ to represent
clusters and weak lensing (see §6.2).

Beginning with the black bars representing the full combinations, we see that these projections
predict improvements of more than an order of magnitude for each of the three parameters ($w_p$,
$w_a$, and $\Delta\gamma$) between current knowledge and Stage IV results. These combinations yield $1\sigma$
errors of approximately 0.005 on $w_p$, 0.1 on $w_a$, and 0.01 on $\Delta\gamma$, testing the $\Lambda$CDM model far more
stringently than it has been tested to date. Stage III projections are roughly the geometric mean
of current and Stage IV constraints in all cases.

Figure 49. Forecasts for inverse $1\sigma$ errors (left axis; errors themselves on the right axis) on $w_p$, $w_a$,
and $\Delta\gamma$ from combining our fiducial CMB+SN+BAO+WL programs with additional constraints
from redshift-space distortions (RSD), clusters (CL), and direct $H_0$ measurements. See text for
a description of the errors assumed for Current (left), Stage III (middle), and Stage IV (right)
forecasts. Black bars show the results of combining all of these probes. Colored bars show the
cumulative impact of dropping probes in succession ("−BAO" should be read as "minus BAO,"
for example). When a Stage IV probe is "dropped" it is set to its Stage III precision, and when a
Stage III probe is "dropped" it is set to its current precision. CMB constraints are always retained.

It is interesting to ask what the different methods contribute to this performance, but there is
no unique way to decompose a constraint into a sum of individual contributions, and the apparent
relative importance of different components depends on how the decomposition is done. We have
attempted one form of "even-handed" decomposition by dropping individual probes in succession,
beginning with the probe whose omission causes the largest increase in the parameter error, then
the probe that causes the largest increase after the first probe has already been dropped, and so
forth. However, when we "drop" a probe we do not omit it entirely; rather, we set the error for that
probe in the Stage IV forecast equal to the value we previously assumed for the Stage III forecast,
or we set the error in the Stage III forecast equal to the value adopted for current data. Thus,
the dark green bar in the upper right shows the impact on $\sigma(w_p)$ of replacing the Stage IV BAO
constraints with the Stage III BAO constraints. The light green bar next to it shows the impact of
also setting the RSD constraint to the Stage III value, the dark blue bar the impact of also setting
the WL constraint to the Stage III value, and so forth. To give one more example, the light blue
bar in the middle of the bottom panel shows the error on $\Delta\gamma$ using Stage III WL+SN+BAO+$H_0$
but current constraints for RSD and CL. We always include CMB constraints, with WMAP9 errors
for current and Planck errors for Stage III and Stage IV. By construction, the rightmost colored
bar for a given stage matches the black bar of the previous stage, since we have then set all probes
back to their value in the previous stage.
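
The successive-dropping procedure described above can be sketched in a few lines of Python. This is a toy illustration only: it assumes independent Gaussian constraints on a single parameter, so that inverse variances simply add, whereas the forecasts in this section combine full multi-parameter constraints. The probe errors and function names below are hypothetical and are not taken from the paper.

```python
import math

def combined_error(errors):
    """1-sigma error from independent Gaussian constraints on one parameter."""
    return 1.0 / math.sqrt(sum(1.0 / e ** 2 for e in errors.values()))

def successive_drop(new_stage, old_stage):
    """Greedy ordering: repeatedly revert the probe whose reversion hurts most.

    "Dropping" a probe means resetting its error to the previous-stage value,
    not removing it from the combination.
    """
    current = dict(new_stage)
    remaining = set(new_stage)
    history = [("all", combined_error(current))]
    while remaining:
        def error_if_reverted(name):
            trial = dict(current)
            trial[name] = old_stage[name]
            return combined_error(trial)
        # sorted() makes tie-breaking deterministic
        worst = max(sorted(remaining), key=error_if_reverted)
        current[worst] = old_stage[worst]
        remaining.remove(worst)
        history.append(("-" + worst, combined_error(current)))
    return history

# Hypothetical per-probe errors on a single parameter (illustrative values only)
stage_iv = {"BAO": 0.010, "SN": 0.020, "WL": 0.030}
stage_iii = {"BAO": 0.040, "SN": 0.030, "WL": 0.050}
steps = successive_drop(stage_iv, stage_iii)
```

In this sketch the last entry of `steps` reproduces the combined previous-stage error, mirroring the statement that the rightmost colored bar matches the black bar of the previous stage.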

We caution against reading too much into the ordering of probes in Figure 49 because it depends
in detail on our assumptions; the detailed examination of CMB+SN+BAO+WL in Tables [SifTOl 
and the associated figures provides much more nuanced information. If we assumed Stage IV SN 
errors of 0.005 mag instead of 0.01 mag, or if we adopted our optimistic WL systematics forecast, 
then the predicted parameter errors would decrease and these probes would play a stronger role. 
Furthermore, a probe only gains in this plot based on its differential improvement between current 
performance and Stage III or between Stage III and Stage IV. 

These caveats notwithstanding, Figure 49 demonstrates several interesting points. In present
data, SNe dominate constraints on the equation-of-state parameters, with BAO alone providing
much weaker constraints. BAO become much more powerful in our fiducial Stage III and Stage IV
programs, making the largest contribution to the $w_p$ and $w_a$ constraints. Current constraints on
$\Delta\gamma$ rely entirely on RSD, as the cluster constraint on $\sigma_8\Omega_m^{0.4}$ is degenerate with $G_9$. With the errors
forecast by White et al. (2008), RSD remains the most powerful contributor to $\Delta\gamma$ constraints at
Stage III and Stage IV, outweighing both WL and clusters. Indeed, with these errors Stage IV RSD
also makes an important contribution to the $w_p$ measurement. WL and clusters make significant
contributions to $\Delta\gamma$ constraints but have weak impact on $w_p$ and $w_a$. Perhaps the most important
message to take from Figure 49 is that these six probes together with CMB measurements provide
a tight web of constraints on cosmic acceleration models, and that even if one or two methods
prove disappointing, there are others (including ones not shown in this plot) to take up the slack. We
have focused much of our review on the stiff challenges of controlling systematic errors at the level
demanded by future dark energy experiments. However, given the ingenuity of the community in
devising and refining analysis methods, we are optimistic that the powerful data sets provided by
these experiments will ultimately lead to constraints at the high end of our forecasts.

9. Conclusions



The first evidence for dark matter emerged from studies of galaxy clusters in the 1930s (Zwicky,
1933), and the dark matter problem assumed a central position in cosmology after technological
advances allowed dynamical measurements in the outer regions of individual galaxies (Rubin and Ford,
1970; Rogstad and Shostak, 1972; see review by Faber and Gallagher 1979). Because it clusters on
small scales, dark matter has a rich phenomenology, and detailed studies of galaxies, galaxy
clusters, large scale structure, the Ly$\alpha$ forest, and the CMB have largely pinned down its properties
even though we have yet to identify the dark matter particle or particles. The implications of the
dark matter problem have proven even more profound than might have been imagined in the 1930s,
pointing the way to an entirely new form of matter whose cosmic mean density exceeds that of
all baryonic material by a ratio of 6:1. There are now several plausible ideas of what dark matter
might be, ideas that are rooted in well motivated extensions of the standard model of particle
physics and that (at least in some cases) naturally explain the observed density of dark matter
(Bertone et al., 2005). With experimental methods advancing on many fronts, there are good
reasons to hope that dark matter will soon be identified in particle accelerators, detected directly
in underground experiments, or detected indirectly via its annihilation into $\gamma$-rays, neutrinos, or
cosmic rays.

Evidence for cosmic acceleration began to emerge in the early 1990s, and it rapidly evolved into 
a near-airtight case following the supernova discoveries of the late 1990s (see §1.1). Whether the
cause is a new energy component or a breakdown of GR, the implications of cosmic acceleration 
are dramatic, even more so than those of dark matter. Cosmic acceleration may ultimately provide 
clues to the nature of quantum gravity, or to the structure of the universe on scales beyond the 
Hubble volume, or to its history over times longer than the Hubble time. There are already many 
theories of cosmic acceleration, but none of them offers a convincing explanation of the observed 
magnitude of the effect, and nearly all of them were introduced to explain the observed acceleration, 
rather than emerging naturally out of fundamental physics models. In contrast to dark matter, most 
models of dark energy predict that it is phenomenologically poor, affecting the overall expansion 
history of the universe but little else. That impression could yet prove incorrect: other signatures of 
"cosmic acceleration physics" might appear in small-scale gravitational experiments, in the behavior 
of gravity in different large scale environments, or in non-gravitational interactions. 

While the solution to the cosmic acceleration problem could come from a surprising direction,
including theory, there is a clear experimental path forward through increasingly precise
measurements of expansion history and growth of structure. Relative to current knowledge, Stage IV
experiments can improve the measurement of basic cosmological observables, $H(z)$, $D(z)$, and
$G(z)$, by one to two orders of magnitude. Correspondingly, they can achieve a 1-2
order-of-magnitude improvement in constraints on $w$, a 2-3 order-of-magnitude improvement in the DETF
figure-of-merit, and still greater gains in higher dimensional parameterizations, including tests of
GR violations. Any robust deviation from a cosmological constant model would have profound
implications, and the greater the precision and detail with which such a deviation is characterized,
the greater the direction for understanding its cause.

We have reviewed in considerable detail the four leading methods — supernovae, BAO, weak 
lensing, and clusters — and we have briefly discussed some of the emerging new methods, whose 
capabilities and limitations are as yet less thoroughly explored. We have also investigated the 
complementarity of these methods for constraining theories of cosmic acceleration. We have spent 
little time on the CMB as it has little direct constraining power on these theories, but it does 
provide crucial constraints on other cosmological parameters that are essential to precision tests. 
We now conclude our article with an editorial recap of our main takeaway points.

Type Ia supernovae have unbeatable precision for measuring distances at $z < 0.5$. Future
surveys can readily achieve statistical errors of 0.01 mag or less (0.5% in distance) averaged over bins
of $\Delta z = 0.2$. The challenge is getting systematic uncertainties at or below the level of such
statistical errors. In our view, the key systematics for SN studies are imperfect photometric calibration,
evolution in the population of SNe represented at different redshifts, and the effects of dust
extinction. The first can be addressed by careful technical design of the instruments used for SN surveys
and of the observing and calibration procedures. The second can be addressed by obtaining high
quality observations of the SNe and their host galaxies that allow one to match the properties of
high and low redshift systems. The third is best addressed by working in the rest-frame near-IR,
where extinction is low. Rest-frame IR observations may also mitigate evolution systematics and
improve statistical errors, since current observations indicate that the scatter in SN luminosities is
smaller in the near-IR than in the optical.
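
The correspondence between 0.01 mag and roughly 0.5% in distance follows from the distance modulus, $m = 5\log_{10} D_L + {\rm const}$, which gives $\sigma_D/D = (\ln 10/5)\,\sigma_m$ for small errors. A short Python sketch of this conversion (the function name is illustrative):

```python
import math

def fractional_distance_error(sigma_mag):
    """Fractional distance error implied by a magnitude error.

    Assumes m = 5 log10(D_L) + const and a small error, so that
    sigma_D / D = (ln 10 / 5) * sigma_m.
    """
    return math.log(10.0) / 5.0 * sigma_mag

stage_iv_fraction = fractional_distance_error(0.01)   # about 0.46%
stage_iii_fraction = fractional_distance_error(0.02)  # about 0.92%
```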

The BAO method complements the SN method in several ways. SN measure distance ratios
relative to local calibrators (i.e., distances in $h^{-1}$ Mpc), while BAO measure absolute distances
(in Mpc) assuming a calibration of the sound horizon. SN and BAO measurements at the same
redshift therefore provide complementary information, effectively constraining $H_0$, which is itself
sensitive to acceleration when combined with CMB data. Spectroscopic BAO measurements that
sample a constant fraction of the sky become more precise at high redshift because they cover
a greater comoving volume and because they measure $H(z)$ directly in addition to $D_A(z)$. (Of
course, they also require a larger number of tracers to probe these larger volumes, and the tracers 
themselves are fainter at higher redshifts.) Cosmic variance limited BAO surveys have roughly 
constant sensitivity to dark energy over the range 1 < z < 3 because the decreasing dynamical 
impact of dark energy at higher redshifts is balanced by the greater BAO measurement precision. 
Furthermore, the BAO method is the only one that we expect to be statistics-limited even with 
Stage IV surveys. Non-linear matter clustering and non-linear galaxy bias may shift the BAO 
peak by more than the statistical errors of Stage IV experiments, but the shifts can be computed 
using theoretical models that are constrained by the smaller scale clustering data, and moderate 
fractional accuracy in these corrections is enough to keep any uncertainty in the corrections well 
below the statistical errors. Thus, we see the main challenge for the BAO method as finding ways 
to efficiently map the available structure. There are several promising ideas, both ground-based 
and space-based, and Stage IV BAO constraints will likely come from a union of several approaches 
covering different redshift ranges. 

Weak lensing measurements provide sensitivity to both the distance-redshift relation and the
growth of structure. The statistical precision achievable with future facilities is very high, so the
challenge is reducing systematic uncertainties to a level that does not overwhelm these statistical
errors. The most important problem is reducing multiplicative shape measurement biases to the
level of $\sim 10^{-3}$ or below, which requires (among other things) determining the PSF that affects
the galaxy images to very high accuracy. This is an area of highly active research, and it is not yet
clear what approach will prove most successful; we have advocated pursuit of a Fourier method that
becomes exact in the limit of high S/N ratio. Since most shape measurement systematics depend
inversely on the ratio $r_{50}/r_{\rm PSF}$ of galaxy size to PSF radius, one can mitigate these systematics by
restricting the analysis to larger galaxies, but this gives up statistical precision by reducing the
surface density of usable sources. The second major challenge for WL studies is the measurement and
calibration of photometric redshift distributions, characterizing both their means and their outlier
fractions at the $\sim 10^{-3}$ level or below. Meeting this challenge requires optical and near-IR imaging
for robust identification of spectral breaks, and large spectroscopic calibration data sets. The third
systematics challenge for WL is intrinsic alignment of galaxies. With continuing theoretical work
and good photometric redshifts, we believe that this systematic can be kept subdominant, but it
remains a challenging problem. WL measurements are rich with observables, including higher
order statistics and varied combinations of galaxy-galaxy lensing, galaxy clustering, and tomography.
Despite the field's formidable technical obstacles, we think it quite possible that constraints from
WL surveys will eventually exceed current forecasts because these additional observables provide
cosmological sensitivity and/or allow systematic uncertainties to be calibrated away.

Cluster abundance measurements provide an alternative route to measuring the growth of
structure and thus testing the consistency of GR growth predictions. In addition, by reducing uncertainty
in $\sigma_8$ and breaking other degeneracies, cluster abundance measurements can sharpen the
equation-of-state constraints from SN, BAO, and WL distance measurements. The key challenge for cluster
cosmology is achieving unbiased and precise calibration of the cluster mass scale. Realizing the
statistical power of future surveys requires absolute mass calibration accurate at the 0.5-1% level.
In our view, this is only achievable with weak lensing, because the baryonic physics associated with 
other observables is too uncertain to predict them this accurately from first principles. We thus 
see cluster studies as a natural byproduct of WL surveys and in some sense as a specialized branch 
of WL, one that takes advantage of the strong additional information afforded by knowing the 
locations of peaks in the optical galaxy density, X-ray flux, or SZ decrement. If WL provides the 
fundamental mass calibration, then the shape measurement and photometric redshift uncertainties 
that affect WL also affect cluster methods. 

While all of these methods can be pursued at ambitious levels from the ground, all would 
benefit from the capabilities of a space mission, especially from the capability of wide-field near-IR 
imaging and spectroscopy, which is possible at the necessary depth only from space. For SN, a 
space platform provides the greater stability and sharp PSF needed for highly accurate photometric 
calibration, and it allows observations in the rest-frame near-IR, which is crucial for minimizing 
extinction systematics and may be valuable for reducing evolution systematics. For BAO, near-IR 
spectroscopy allows emission-line galaxy surveys over the huge comoving volume from 1.2 < z < 2, 
which is difficult to probe with ground-based optical or IR observations. (Intensity-mapping radio 
methods may be able to probe this redshift range from the ground, but this approach still has 
significant technological hurdles to overcome.) For WL, space observations allow the deep near-IR 
photometry that is essential for robust and accurate photometric redshifts, and they provide stable 
imaging with a sharp PSF that enables accurate shape measurements for a high surface-density 
source population. The above considerations motivated both WFIRST and the IR capabilities 
of Euclid. Space-based optical imaging, the other major element of Euclid, allows a significantly 
sharper PSF and thus potentially more powerful WL measurements, if the systematic errors are 
sufficiently well controlled. More generally, space-based WL measurements can employ a higher 
galaxy surface density than ground-based surveys to the same photometric depth, both because the 
PSF itself is smaller and because greater stability and the absence of atmospheric effects should 
allow accurate measurements down to a smaller ratio of $r_{50}/r_{\rm PSF}$.

The current generation of "Stage III" experiments such as DES, BOSS, PS1, and HETDEX
are collectively pursuing all of these methods, and they should achieve dark energy constraints
substantially better than those that exist today. It is crucial that the next-generation Stage IV
experiments maintain, collectively, a balanced program that includes SN, BAO, and WL, as well
as other methods (clusters, Alcock-Paczynski, redshift-space distortions) that can be applied to
the same data sets. There is much more to be gained, and much lower risk, from doing a good
job on all three methods than from doing a maximal job on one at the expense of the others. A
balanced program takes advantage of the methods' complementary information content and areas
of sensitivity, and it allows the best cross-checks for systematic errors. It is becoming standard
practice to trade systematic uncertainties for statistical errors by parameterizing their impact and
marginalizing, e.g., over an uncertain shear calibration multiplier or photometric redshift offset.
While this is a powerful strategy for removing biases due to "known unknowns," it does not protect
against "unknown unknowns." Any conclusion about cosmic acceleration will be more compelling
if it is demonstrated by independent methods, and the more interesting the conclusion, the more
crucial this independent confirmation will be.
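
The trade of systematic for statistical error through marginalization can be illustrated with a two-parameter Fisher matrix: one cosmological parameter and one nuisance parameter. The sketch below is a toy example under stated assumptions (Gaussian likelihood, a 2x2 Fisher matrix, hypothetical numerical values); it is not the forecasting machinery used in this review.

```python
import math

def marginalized_error(f_pp, f_pn, f_nn):
    """1-sigma error on parameter p after marginalizing over nuisance n.

    Uses the (p, p) element of the inverse of the 2x2 Fisher matrix
    [[f_pp, f_pn], [f_pn, f_nn]].
    """
    det = f_pp * f_nn - f_pn ** 2
    return math.sqrt(f_nn / det)

def conditional_error(f_pp):
    """1-sigma error on p if the nuisance parameter were known exactly."""
    return 1.0 / math.sqrt(f_pp)

# Hypothetical Fisher elements for (cosmological parameter, shear calibration multiplier)
F_PP, F_PN, F_NN = 100.0, 80.0, 100.0

sigma_fixed = conditional_error(F_PP)
sigma_marg = marginalized_error(F_PP, F_PN, F_NN)

# An external Gaussian prior of width sigma_prior on the nuisance parameter
# adds 1 / sigma_prior**2 to its diagonal Fisher element.
sigma_prior = 0.01
sigma_with_prior = marginalized_error(F_PP, F_PN, F_NN + 1.0 / sigma_prior ** 2)
```

Marginalizing inflates the error relative to the fixed-nuisance case, and a tight external calibration prior recovers most of the loss. None of this helps if the true systematic is not represented by the nuisance parameter at all, which is the "unknown unknowns" caveat.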

In §8 we have provided quantitative forecasts for a fiducial Stage IV program and for many
variants upon it. Our fiducial SN program assumes 0.01 mag mean errors for a local calibrator
sample at $z = 0.05$ and in three bins of $\Delta z = 0.2$ at $0.2 < z < 0.8$, uncorrelated from bin to bin.
Our fiducial BAO program assumes mapping 1/4 of the sky to $z = 3$, with errors that are 1.8 times
the linear theory sample variance errors over this volume. Different combinations of redshift range
and sky coverage that have the same comoving volume yield nearly the same results. Our fiducial
WL program assumes statistical errors of a $\sim 10^9$-galaxy imaging survey (more precisely, $10^4$ deg$^2$
with 23 galaxies/arcmin$^2$), and systematic errors of $2 \times 10^{-3}$ in shear calibration and photometric
redshift calibration. We also consider an optimistic case in which the total (systematic + statistical)
errors are simply double the statistical errors, which effectively corresponds to total errors $\sim 2$-3
times smaller than those of the fiducial case. Our fiducial program corresponds fairly closely to
the one recommended by the Astro2010 Cosmology and Fundamental Physics panel, and it is a
reasonable, probably conservative forecast of what could be achieved by a combination of LSST,
Euclid/WFIRST, and ground-based BAO and SN surveys.

To quantify the expected performance of this program and its variants, we considered two dark
energy models, one with $w(a) = w_0 + w_a(1 - a) = w_p + w_a(a_p - a)$, where $a_p = (1 + z_p)^{-1}$ is the
expansion factor at which $w$ is best constrained, and one with $w(a)$ allowed to vary freely in each of
36 bins of $\Delta a = 0.025$, reaching to $z = 9$. In both cases we allowed deviations from GR-predicted
growth rates characterized by an overall multiplicative offset $G_9$ in $G(z)$ and by a shift $\Delta\gamma$ in
the logarithmic growth rate $d\ln G/d\ln a \propto [\Omega_m(a)]^{\gamma+\Delta\gamma}$. We focused principally on the expected
errors in $w_p$, $w_a$, $\Delta\gamma$, and $G_9$, including the DETF FoM defined as $(\sigma_{w_p}\sigma_{w_a})^{-1}$. While principal
components (PCs) of the general $w(a)$ model allow a much richer characterization of the dark
energy history (and its uncertainties), we regard the combination of the DETF FoM and the $\Delta\gamma$
error to be as good as any alternative for characterizing the strength of a combined program.
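
The two equivalent forms of the $w_0$-$w_a$ parameterization and the DETF figure of merit can be written out directly. The following Python sketch uses hypothetical values of $w_0$, $w_a$, and $z_p$ to check the equivalence, and evaluates the FoM from the rounded Stage IV errors quoted in this section (0.014 on $w_p$ and 0.11 on $w_a$); the function names are illustrative.

```python
def w_standard(a, w0, wa):
    """Equation of state w(a) = w0 + wa * (1 - a)."""
    return w0 + wa * (1.0 - a)

def pivot_value(w0, wa, z_p):
    """w_p = w(a_p), with a_p = 1 / (1 + z_p) the pivot expansion factor."""
    a_p = 1.0 / (1.0 + z_p)
    return w0 + wa * (1.0 - a_p)

def w_pivot_form(a, wp, wa, z_p):
    """Same equation of state written as w(a) = w_p + wa * (a_p - a)."""
    a_p = 1.0 / (1.0 + z_p)
    return wp + wa * (a_p - a)

def detf_fom(sigma_wp, sigma_wa):
    """DETF figure of merit, (sigma_wp * sigma_wa)**-1."""
    return 1.0 / (sigma_wp * sigma_wa)

# Hypothetical model parameters, used only to check that the two forms agree
W0, WA, Z_P = -0.9, 0.3, 0.5
wp = pivot_value(W0, WA, Z_P)
difference = w_standard(0.7, W0, WA) - w_pivot_form(0.7, wp, WA, Z_P)

# Rounded fiducial Stage IV errors quoted in the text
fom_from_rounded_errors = detf_fom(0.014, 0.11)
```

The rounded errors give a FoM of about 650, close to the quoted value of 664; the small difference is plausibly due to rounding of the quoted errors.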

The primary results of our forecasting investigation appear in Tables [HVllOl and, in distilled form, 
in Figures 30 and 35. The FoM of our fiducial program is 664, more than five times better than our
Stage III forecast and a roughly 50-fold improvement on current knowledge. Within the adopted
parameterization, $1\sigma$ errors on individual parameters are 0.014 on $w_p$, 0.11 on $w_a$, 0.034 on $\Delta\gamma$,
0.015 on $\ln G_9$, $5.5 \times 10^{-4}$ on $\Omega_k$, and $5.1 \times 10^{-3}$ on $h$. All three methods contribute significantly
to these constraints. For our fiducial assumptions, BAO have the greatest leverage on the DETF
FoM, in the sense that halving the BAO errors produces the greatest increase in the FoM while
doubling the BAO errors produces the greatest decrease. WL has the least leverage, which implies
that the fiducial BAO and SN measurements constrain the expansion history well enough that
the WL measurements add relatively little constraining power. However, the error on $\Delta\gamma$ scales
nearly linearly with the WL errors, since all of the information on growth comes from the WL
measurements. (Note that we scale the total WL errors, equivalent to multiplying systematic and
statistical errors by the same factor.) Conversely, changing the SN or BAO errors has almost no
impact on the $\Delta\gamma$ constraint.

Changing to our optimistic assumptions about WL systematics (total errors equal to twice the
statistical errors), while retaining the fiducial SN and BAO assumptions, raises the FoM from 664
to 789 and lowers the $\Delta\gamma$ error from 0.034 to 0.026. For the optimistic systematics model, WL
measurements have the greatest leverage on the DETF FoM instead of the least, and the $\Delta\gamma$ errors
continue to scale approximately linearly with the WL errors. Thus, our conclusions about the power
of WL relative to BAO and SN depend significantly on the assumed importance of WL systematics,
which is difficult to predict at present.

When we move from the $w_0$-$w_a$ model to the general $w(z)$ model, the forecast errors on $\Delta\gamma$
barely change, since it is constrained by differential measurements of matter clustering over the
redshift range of our fiducial data sets. The errors on $G_9$, on the other hand, expand dramatically,
because even within GR the overall amplitude of structure can be shifted by the behavior of $w(z)$
outside of our constrained redshift range (i.e., at $z > 3$). If the amplitude of matter clustering
proved inconsistent with that of a $w_0$-$w_a$, $G_9 = 1$ model, it would definitely indicate something
interesting, but this measurement alone would not show whether the unusual behavior arises from
a violation of GR or from unexpected behavior of $w(z)$ at high redshift.

For variations around our fiducial program, the impact of reducing the errors of SN
measurements is much greater than the impact of increasing the redshift range of these measurements. For
example, reducing the error per redshift bin from 0.01 mag to 0.005 mag increases the FoM from
664 to 1197, while increasing the maximum redshift from 0.8 to 1.6 only raises the FoM to 841.
These scalings imply that the highest priority for SN studies is to minimize statistical and
systematic errors at $z < 1$, and that pushing to higher redshifts is a lower priority until the reduction in
$z < 1$ systematics has been saturated. At fixed $f_{\rm sky}$, BAO constraints have a stronger dependence
on maximum redshift, because at higher $z$ the BAO measurements become more precise and the
importance of the direct $H(z)$ measurements grows.

We have not incorporated cluster abundances into our primary forecasts, but we have
investigated how precisely our fiducial Stage III and Stage IV programs (CMB+SN+BAO+WL) predict
the parameter combination $\sigma_{11,{\rm abs}}(z)\,\Omega_m^{0.4}$ that is best constrained by cluster abundances. For a
$w_0$-$w_a$ dark energy model, the forecast precision is $\sim 1.5$% for Stage III and $\sim 0.75$% for Stage
IV if we assume GR is correct. If we allow GR deviations parameterized by $G_9$ and $\Delta\gamma$, then the
forecast precision degrades significantly, especially for Stage III at $z > 0.5$. Our analysis in §6.3.3
indicates that clusters calibrated by stacked weak lensing should be able to achieve higher precision
on $\sigma_{11,{\rm abs}}(z)\,\Omega_m^{0.4}$. When we add the anticipated cluster constraints for a $10^4$ deg$^2$ survey with a
$10^{14} M_\odot$ mass threshold, assuming that calibration errors are limited by weak lensing statistics, we
find that the DETF FoM grows by a factor of 1.4 at Stage III and 1.9 at Stage IV relative to the
fiducial CMB+SN+BAO+WL program. The error on $\Delta\gamma$ decreases by a factor of 3.2 for Stage III
and by 1.6 for Stage IV. Cluster studies will be enabled automatically by large WL surveys, which
can be used to identify clusters as optical galaxy concentrations and to provide mass calibration
for clusters identified by any method (X-ray, SZ, optical). If they can achieve the limits imposed
by weak lensing statistics, they can add considerable leverage to tests of dark energy models and
deviations from GR.

We have adopted a similar strategy for some of the alternative probes discussed in §7. For a
$w_0$-$w_a$ dark energy model, the forecast precision on $H_0$ is 0.7% from our fiducial Stage IV program,
1.3% for Stage III. A direct measurement of $H_0$ with 1% precision would improve the DETF FoM
of the fiducial Stage IV program by 20%; a 2% measurement would improve the Stage III FoM
by 15%. However, the forecast constraint on $H_0$ degrades to $\sim 60$% in our general $w(z)$ model,
since large changes in $w$ at low redshift can affect $H_0$ significantly while having minimal impact on
probes at higher redshift. Thus, a discrepancy between direct measurements and $H_0$ constraints
from CMB+SN+BAO+WL data could be a diagnostic for unusual low-$z$ evolution of dark energy.

The Alcock-Pacyznski parameter H{z)Da{z) is constrained to ~ 0.2 — 0.3% by our fiducial 
Stage IV program over the redshift range 0.2 < 2; < 3, setting a demanding target for AP tests. 
The corresponding precision forecast for Stage III is ~ 0.5%. Redshift-space distortions and galaxy 
clustering can measure the parameter combination as{z)f{z), which is constrained by our Stage IV 
fiducial program to about 5% at z ^ 0.1, 2.5% at z = 0.5, and ~ 1% beyond z = 1, numbers that 
improve only slightly for the optimistic weak lensing systematics. For Stage III, the constraints 
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at a given redshift are considerably weaker. In all cases the constraints are tighter if we assume 
GR (A7 = 0, Gg = 1), but the main purpose of redshift-space distortion analyses would be 
to test GR growth, so we regard the looser constraints as the more relevant targets for such 
analyses. This level of precision appears within reach of large galaxy redshift surveys if theoretical 
systematics can be adequately controlled, making redshift-space distortions a potentially powerful 
addition to the arsenal of cosmic acceleration probes. While WL and redshift-space distortions 
both probe structure growth, they have different dependences on the two distinct potentials that 
enter the GR spacetime metric (see §7.7), so a discrepancy between them could reveal a GR 
deviation that might not be captured by Δγ alone. Galaxy redshift surveys designed for BAO 
measurements should allow redshift-space distortion analyses (and AP tests) as an automatic 
by-product, which may greatly increase their science return. Precise measurements of the shape of 
the galaxy power spectrum could also reveal signs of scale-dependent growth, another possible 
consequence of modified gravity models, though these may be difficult to distinguish from other 
factors that affect the power spectrum shape (see §7.7). 
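
To make the two target quantities concrete, the following Python sketch evaluates the Alcock-Paczynski parameter H(z)D_A(z) (in units of c) and the combination σ_8(z)f(z) for a flat w_0-w_a model. It is an illustration under stated assumptions, not the forecasting machinery used in this review: radiation is neglected, the growth rate is approximated by a constant growth index, f(z) = Ω_m(z)^γ with γ = 0.55 + Δγ, and the default values Ω_m = 0.3 and σ_8(z=0) = 0.8 are hypothetical. All function names are our own.

```python
import math


def e_of_z(z, omega_m=0.3, w0=-1.0, wa=0.0):
    """H(z)/H0 for a flat w0-wa model (radiation neglected)."""
    dark_energy = ((1.0 - omega_m) * (1.0 + z) ** (3.0 * (1.0 + w0 + wa))
                   * math.exp(-3.0 * wa * z / (1.0 + z)))
    return math.sqrt(omega_m * (1.0 + z) ** 3 + dark_energy)


def simpson(func, a, b, n=2000):
    """Composite Simpson's rule; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def ap_parameter(z, **cosmo):
    """H(z) D_A(z) / c for a flat universe."""
    comoving = simpson(lambda x: 1.0 / e_of_z(x, **cosmo), 0.0, z)  # units of c/H0
    return e_of_z(z, **cosmo) * comoving / (1.0 + z)


def growth_rate(z, delta_gamma=0.0, omega_m=0.3, **cosmo):
    """f(z) = Omega_m(z)^gamma with gamma = 0.55 + delta_gamma (assumed form)."""
    omega_m_z = omega_m * (1.0 + z) ** 3 / e_of_z(z, omega_m=omega_m, **cosmo) ** 2
    return omega_m_z ** (0.55 + delta_gamma)


def f_sigma8(z, sigma8_0=0.8, delta_gamma=0.0, **cosmo):
    """sigma_8(z) f(z), using D(z)/D(0) = exp(-int_0^z f(z') dz' / (1 + z'))."""
    ln_growth = -simpson(
        lambda x: growth_rate(x, delta_gamma, **cosmo) / (1.0 + x), 0.0, z)
    return sigma8_0 * math.exp(ln_growth) * growth_rate(z, delta_gamma, **cosmo)


# Fractional shift in H(z)D_A(z) at z = 1 when w0 changes from -1 to -0.9,
# to be compared with the sub-percent forecast precision quoted above.
ap_lcdm = ap_parameter(1.0)
ap_shift = ap_parameter(1.0, w0=-0.9) / ap_lcdm - 1.0
print(f"H D_A / c at z=1: {ap_lcdm:.4f}, fractional shift for w0=-0.9: {ap_shift:+.4f}")
print(f"f sigma8 at z=0.5: GR {f_sigma8(0.5):.4f}, "
      f"delta_gamma=0.1 {f_sigma8(0.5, delta_gamma=0.1):.4f}")
```

In an Einstein-de Sitter universe (Ω_m = 1) both functions reduce to closed forms, H(z)D_A(z)/c = 2[(1+z)^(1/2) - 1] and σ_8(z)f(z) = σ_8(0)/(1+z), which provides a simple check of the numerical integration.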

The future of cosmic acceleration studies depends partly on the facilities built to enable them, 
partly on the ingenuity of experimenters and theorists in controlling systematic errors and fully 
exploiting their data sets, and partly on the kindness of nature. The next generation of experiments 
could merely tighten the noose around w = -1, ruling out many specific theories but leaving us 
no more enlightened than we are today about the origin of cosmic acceleration. However, barely a 
decade after the first supernova measurements of an accelerating universe, it seems unwise to bet 
that we have uncovered the last "surprise" in cosmology. Equally important, the powerful data sets 
required to study cosmic acceleration support a broad range of astronomical investigations. These 
observational efforts are natural next steps in a long-standing astronomical tradition: mapping the 
universe with increasing precision over ever larger scales, from the solar system to the Galaxy to 
large scale structure to the CMB. These ever growing maps have taught us extraordinary things — 
that gravity is a universal phenomenon, that we live in a galaxy populated by 100 billion stars, that 
our galaxy is one of 100 billion within our Hubble volume, that our entire observable universe has 
expanded from a hot big bang 14 billion years in the past, that the dominant form of matter in the 
universe is non-baryonic, and that the early universe was seeded by Gaussian (or nearly Gaussian) 
fluctuations that have grown by gravity into all of the structure that we observe today. We hope 
that the continuation of this tradition will lead to new insights that are equally profound. 

Appendix A. Glossary of Acronyms and Facilities 



Note that we have not repeated the acronyms of X-ray surveys listed in Table HI 

ACS: Advanced Camera for Surveys (on Hubble Space Telescope) 

ACT: Atacama Cosmology Telescope 

ADEPT: Advanced Dark Energy Physics Telescope 

AP: Alcock-Paczynski 

BAO: Baryon Acoustic Oscillations 

BOSS: Baryon Oscillation Spectroscopic Survey 

BigBOSS: Big Baryon Oscillation Spectroscopic Survey 

CCD: Charge Coupled Device 

CDM: Cold Dark Matter 

CFHT: Canada-France-Hawaii Telescope 

CFHTLS: Canada-France-Hawaii Telescope Legacy Survey 

Chandra: Chandra X-ray Observatory (NASA) 

CHIME: Canadian Hydrogen Intensity Mapping Experiment 

CMB: Cosmic Microwave Background 

COBE: Cosmic Background Explorer 

COSMOS: Cosmic Evolution Survey (from Hubble Space Telescope) 

CSP: Carnegie Supernova Project 

DES: Dark Energy Survey 

DEspec: Dark Energy Spectrograph 

DESTINY: Dark Energy Space Telescope 

DETF: Dark Energy Task Force 

DUNE: Dark Universe Explorer 

EE50: Encircled Energy 50% 

eROSITA: extended Roentgen Survey with an Imaging Telescope Array 
ESA: European Space Agency 

ESSENCE: Equation of State: SupErNovae trace Cosmic Expansion 

Euclid: Euclid dark energy space mission (ESA) 

FKP: Feldman-Kaiser-Peacock (1994) P(k) estimation method 

FFT: Fast Fourier Transform 

FIRST: Faint Images of the Radio Sky at Twenty-Centimeters (from the VLA) 
FWHM: Full Width at Half Maximum 
Gaia: Gaia astrometry mission (ESA) 
GR: General Relativity 

HEAO: High-Energy Astrophysics Observatory (NASA) 

HETDEX: Hobby-Eberly Telescope Dark Energy Experiment 

HOD: Halo Occupation Distribution 

HSC: Hyper-Suprime Camera (for Subaru Telescope) 

HST: Hubble Space Telescope 

IGM: Intergalactic Medium 

IRAC: Infrared Array Camera (on Spitzer Space Telescope) 

ISCS: IRAC Shallow Cluster Survey 

JDEM: Joint Dark Energy Mission 

JEDI: Joint Efficient Dark-energy Investigation 

JWST: James Webb Space Telescope 

KIDS: Kilo-Degree Survey 
LCS: Light Curve Shape 

LIGO: Laser Interferometer Gravitational Wave Observatory 

LOSS: Lick Observatory Supernova Survey 

LRG: Luminous Red Galaxy 

LSST: Large Synoptic Survey Telescope 

NASA: National Aeronautics and Space Administration 

NOAO: National Optical Astronomy Observatories 

NVSS: NRAO VLA Sky Survey 

Pan-STARRS: Panoramic Survey Telescope and Rapid Response System 

PCA: Principal Component Analysis 

Planck: Planck CMB satellite (ESA) 

PS1: Pan-STARRS 1 

PSF: Point Spread Function 

PTF: Palomar Transient Factory 

RASS: ROSAT All Sky Survey 

RCS: Red-Sequence Cluster Survey 

ROSAT: Roentgen Satellite 

SDSS: Sloan Digital Sky Survey 

SED: Spectral Energy Distribution 

SFR: Star Formation Rate 

SKA: Square Kilometer Array 

SN: Supernovae 

SNAP: Supernova Acceleration Probe 

SNR: Signal-to-Noise Ratio 

SPT: South Pole Telescope 

STEP: Satellite Test of Equivalence Principle 

SuMIRe: Subaru Measurement of Images and Redshifts 

2SLAQ: 2dF and SDSS Large Area Quasar survey 

VHS: VISTA Hemisphere Survey 

VLBI: Very Long Baseline Interferometry 

VIKING: VISTA Kilo-Degree Infrared Galaxy Survey 

VIRGO: VIRGO gravity wave observatory 

VVDS: VIMOS-VLT Deep Survey 

UKIDSS: UKIRT Infrared Deep Sky Survey 

WFC3: Wide-Field Camera 3 (on Hubble Space Telescope) 

WFPC2: Wide-Field and Planetary Camera 2 (on Hubble Space Telescope) 

WFIRST: Wide Field Infrared Survey Telescope 

WiggleZ: WiggleZ galaxy redshift survey 

WISE: Wide-field Infrared Survey Explorer 

WL: Weak Lensing 

WMAP: Wilkinson Microwave Anisotropy Probe (NASA) 
XMM-Newton: X-ray Multi-Mirror Mission 